first commit

commit ff9c54d5e4
2025-04-24 13:11:28 +08:00
5960 changed files with 834111 additions and 0 deletions

12
services/chat/.gitignore vendored Normal file

@@ -0,0 +1,12 @@
**.swp
public/build/
node_modules/
plato/
**/*.map
# managed by dev-environment$ bin/update_build_scripts
.npmrc

3
services/chat/.mocharc.json Normal file

@@ -0,0 +1,3 @@
{
"require": "test/setup.js"
}

1
services/chat/.nvmrc Normal file

@@ -0,0 +1 @@
20.18.2

27
services/chat/Dockerfile Normal file

@@ -0,0 +1,27 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
FROM node:20.18.2 AS base
WORKDIR /overleaf/services/chat
# Google Cloud Storage needs a writable $HOME/.config for resumable uploads
# (see https://googleapis.dev/nodejs/storage/latest/File.html#createWriteStream)
RUN mkdir /home/node/.config && chown node:node /home/node/.config
FROM base AS app
COPY package.json package-lock.json /overleaf/
COPY services/chat/package.json /overleaf/services/chat/
COPY libraries/ /overleaf/libraries/
COPY patches/ /overleaf/patches/
RUN cd /overleaf && npm ci --quiet
COPY services/chat/ /overleaf/services/chat/
FROM app
USER node
CMD ["node", "--expose-gc", "app.js"]

662
services/chat/LICENSE Normal file

@@ -0,0 +1,662 @@
GNU AFFERO GENERAL PUBLIC LICENSE
Version 3, 19 November 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
our General Public Licenses are intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.
A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate. Many developers of free software are heartened and
encouraged by the resulting cooperation. However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.
The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community. It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server. Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.
An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals. This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU Affero General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Remote Network Interaction; Use with the GNU General Public License.
Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software. This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU Affero General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source. For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code. There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<http://www.gnu.org/licenses/>.

156
services/chat/Makefile Normal file

@@ -0,0 +1,156 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
BUILD_NUMBER ?= local
BRANCH_NAME ?= $(shell git rev-parse --abbrev-ref HEAD)
PROJECT_NAME = chat
BUILD_DIR_NAME = $(shell pwd | xargs basename | tr -cd '[a-zA-Z0-9_.\-]')
DOCKER_COMPOSE_FLAGS ?= -f docker-compose.yml
DOCKER_COMPOSE := BUILD_NUMBER=$(BUILD_NUMBER) \
BRANCH_NAME=$(BRANCH_NAME) \
PROJECT_NAME=$(PROJECT_NAME) \
MOCHA_GREP=${MOCHA_GREP} \
docker compose ${DOCKER_COMPOSE_FLAGS}
COMPOSE_PROJECT_NAME_TEST_ACCEPTANCE ?= test_acceptance_$(BUILD_DIR_NAME)
DOCKER_COMPOSE_TEST_ACCEPTANCE = \
COMPOSE_PROJECT_NAME=$(COMPOSE_PROJECT_NAME_TEST_ACCEPTANCE) $(DOCKER_COMPOSE)
COMPOSE_PROJECT_NAME_TEST_UNIT ?= test_unit_$(BUILD_DIR_NAME)
DOCKER_COMPOSE_TEST_UNIT = \
COMPOSE_PROJECT_NAME=$(COMPOSE_PROJECT_NAME_TEST_UNIT) $(DOCKER_COMPOSE)
clean:
-docker rmi ci/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER)
-docker rmi us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER)
-$(DOCKER_COMPOSE_TEST_UNIT) down --rmi local
-$(DOCKER_COMPOSE_TEST_ACCEPTANCE) down --rmi local
HERE=$(shell pwd)
MONOREPO=$(shell cd ../../ && pwd)
# Run the linting commands in the scope of the monorepo.
# Eslint and prettier (plus some configs) are on the root.
RUN_LINTING = docker run --rm -v $(MONOREPO):$(MONOREPO) -w $(HERE) node:20.18.2 npm run --silent
RUN_LINTING_CI = docker run --rm --volume $(MONOREPO)/.editorconfig:/overleaf/.editorconfig --volume $(MONOREPO)/.eslintignore:/overleaf/.eslintignore --volume $(MONOREPO)/.eslintrc:/overleaf/.eslintrc --volume $(MONOREPO)/.prettierignore:/overleaf/.prettierignore --volume $(MONOREPO)/.prettierrc:/overleaf/.prettierrc --volume $(MONOREPO)/tsconfig.backend.json:/overleaf/tsconfig.backend.json ci/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER) npm run --silent
# Same but from the top of the monorepo
RUN_LINTING_MONOREPO = docker run --rm -v $(MONOREPO):$(MONOREPO) -w $(MONOREPO) node:20.18.2 npm run --silent
SHELLCHECK_OPTS = \
--shell=bash \
--external-sources
SHELLCHECK_COLOR := $(if $(CI),--color=never,--color)
SHELLCHECK_FILES := { git ls-files "*.sh" -z; git grep -Plz "\A\#\!.*bash"; } | sort -zu
shellcheck:
@$(SHELLCHECK_FILES) | xargs -0 -r docker run --rm -v $(HERE):/mnt -w /mnt \
koalaman/shellcheck:stable $(SHELLCHECK_OPTS) $(SHELLCHECK_COLOR)
shellcheck_fix:
@$(SHELLCHECK_FILES) | while IFS= read -r -d '' file; do \
diff=$$(docker run --rm -v $(HERE):/mnt -w /mnt koalaman/shellcheck:stable $(SHELLCHECK_OPTS) --format=diff "$$file" 2>/dev/null); \
if [ -n "$$diff" ] && ! echo "$$diff" | patch -p1 >/dev/null 2>&1; then echo "\033[31m$$file\033[0m"; \
elif [ -n "$$diff" ]; then echo "$$file"; \
else echo "\033[2m$$file\033[0m"; fi \
done
format:
$(RUN_LINTING) format
format_ci:
$(RUN_LINTING_CI) format
format_fix:
$(RUN_LINTING) format:fix
lint:
$(RUN_LINTING) lint
lint_ci:
$(RUN_LINTING_CI) lint
lint_fix:
$(RUN_LINTING) lint:fix
typecheck:
$(RUN_LINTING) types:check
typecheck_ci:
$(RUN_LINTING_CI) types:check
test: format lint typecheck shellcheck test_unit test_acceptance
test_unit:
ifneq (,$(wildcard test/unit))
$(DOCKER_COMPOSE_TEST_UNIT) run --rm test_unit
$(MAKE) test_unit_clean
endif
test_clean: test_unit_clean
test_unit_clean:
ifneq (,$(wildcard test/unit))
$(DOCKER_COMPOSE_TEST_UNIT) down -v -t 0
endif
test_acceptance: test_acceptance_clean test_acceptance_pre_run test_acceptance_run
$(MAKE) test_acceptance_clean
test_acceptance_debug: test_acceptance_clean test_acceptance_pre_run test_acceptance_run_debug
$(MAKE) test_acceptance_clean
test_acceptance_run:
ifneq (,$(wildcard test/acceptance))
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run --rm test_acceptance
endif
test_acceptance_run_debug:
ifneq (,$(wildcard test/acceptance))
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run -p 127.0.0.9:19999:19999 --rm test_acceptance npm run test:acceptance -- --inspect=0.0.0.0:19999 --inspect-brk
endif
test_clean: test_acceptance_clean
test_acceptance_clean:
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) down -v -t 0
test_acceptance_pre_run:
ifneq (,$(wildcard test/acceptance/js/scripts/pre-run))
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run --rm test_acceptance test/acceptance/js/scripts/pre-run
endif
benchmarks:
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run --rm test_acceptance npm run benchmarks
build:
docker build \
--pull \
--build-arg BUILDKIT_INLINE_CACHE=1 \
--tag ci/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER) \
--tag us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER) \
--tag us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):$(BRANCH_NAME) \
--cache-from us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):$(BRANCH_NAME) \
--cache-from us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):main \
--file Dockerfile \
../..
tar:
$(DOCKER_COMPOSE) up tar
publish:
docker push $(DOCKER_REPO)/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER)
.PHONY: clean \
format format_fix \
lint lint_fix \
build_types typecheck \
lint_ci format_ci typecheck_ci \
shellcheck shellcheck_fix \
test test_clean test_unit test_unit_clean \
test_acceptance test_acceptance_debug test_acceptance_pre_run \
test_acceptance_run test_acceptance_run_debug test_acceptance_clean \
benchmarks \
build tar publish \

11
services/chat/README.md Normal file

@@ -0,0 +1,11 @@
overleaf/chat
===============
The backend API that powers the chat service in Overleaf
License
-------
The code in this repository is released under the GNU AFFERO GENERAL PUBLIC LICENSE, version 3. A copy can be found in the `LICENSE` file.
Copyright (c) Overleaf, 2014-2019.
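
As a rough illustration of what this backend exposes, the sketch below sends and fetches global project messages over HTTP. It is not part of the commit: the host and port come from the `servers` entry in `chat.yaml` (`http://chat:3010`), the route from the `/project/{projectId}/messages` path declared there, and the request/response behaviour from `MessageHttpController.js` later in this diff; the IDs are made-up ObjectId-like strings, and the exact operation-to-route binding is inferred from the controller names.

```js
// Illustrative client sketch; host, port and IDs are assumptions (see above).
const base = 'http://chat:3010'
const projectId = '64c8e1f2a1b2c3d4e5f60001' // must be a valid ObjectId string
const userId = '64c8e1f2a1b2c3d4e5f60002'

// Send a message to the project's global chat thread (201 Created on success)
await fetch(`${base}/project/${projectId}/messages`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ user_id: userId, content: 'Hello from the chat API' }),
})

// Fetch the most recent messages; `limit` and `before` mirror _getMessages below
const res = await fetch(`${base}/project/${projectId}/messages?limit=50`)
const messages = await res.json()
```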

26
services/chat/app.js Normal file

@@ -0,0 +1,26 @@
// Metrics must be initialized before importing anything else
import '@overleaf/metrics/initialize.js'
import logger from '@overleaf/logger'
import settings from '@overleaf/settings'
import { mongoClient } from './app/js/mongodb.js'
import { createServer } from './app/js/server.js'
const port = settings.internal.chat.port
const host = settings.internal.chat.host
mongoClient
.connect()
.then(async () => {
const { server } = await createServer()
server.listen(port, host, function (err) {
if (err) {
logger.fatal({ err }, `Cannot bind to ${host}:${port}. Exiting.`)
process.exit(1)
}
logger.debug(`Chat starting up, listening on ${host}:${port}`)
})
})
.catch(err => {
logger.fatal({ err }, 'Cannot connect to mongo. Exiting.')
process.exit(1)
})
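
The entry point above reads `settings.internal.chat.port` and `settings.internal.chat.host`, and `mongodb.js` below reads `Settings.mongo.url` and `Settings.mongo.options`. The defaults file supplying those values is not part of this excerpt; the sketch below only shows the shape the code assumes, with illustrative values (port 3010 matches the server URL in `chat.yaml`).

```js
// Assumed @overleaf/settings shape consumed by app.js and mongodb.js.
// All values are illustrative, not the real defaults.
const settingsSketch = {
  internal: {
    chat: {
      host: '0.0.0.0',
      port: 3010,
    },
  },
  mongo: {
    url: 'mongodb://mongo/overleaf', // hypothetical connection string
    options: {},
  },
}
```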

60
services/chat/app/js/Features/Messages/MessageFormatter.js Normal file

@@ -0,0 +1,60 @@
export function formatMessageForClientSide(message) {
if (message._id) {
message.id = message._id.toString()
delete message._id
}
const formattedMessage = {
id: message.id,
content: message.content,
timestamp: message.timestamp,
user_id: message.user_id,
}
if (message.edited_at) {
formattedMessage.edited_at = message.edited_at
}
return formattedMessage
}
export function formatMessagesForClientSide(messages) {
return messages.map(message => formatMessageForClientSide(message))
}
export function groupMessagesByThreads(rooms, messages) {
let room, thread
const roomsById = {}
for (room of rooms) {
roomsById[room._id.toString()] = room
}
const threads = {}
const getThread = function (room) {
const threadId = room.thread_id.toString()
if (threads[threadId]) {
return threads[threadId]
} else {
const thread = { messages: [] }
if (room.resolved) {
thread.resolved = true
thread.resolved_at = room.resolved.ts
thread.resolved_by_user_id = room.resolved.user_id
}
threads[threadId] = thread
return thread
}
}
for (const message of messages) {
room = roomsById[message.room_id.toString()]
if (room) {
thread = getThread(room)
thread.messages.push(formatMessageForClientSide(message))
}
}
for (const threadId in threads) {
thread = threads[threadId]
thread.messages.sort((a, b) => a.timestamp - b.timestamp)
}
return threads
}
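
A minimal sketch of the shape `groupMessagesByThreads` produces, based on the code above. Plain strings stand in for ObjectIds (the code only calls `.toString()` on them), and the data is made up for illustration.

```js
// Input: one resolved thread room and one message belonging to it.
const rooms = [
  { _id: 'room1', thread_id: 'threadA', resolved: { ts: 1700000001000, user_id: 'user2' } },
]
const messages = [
  { _id: 'msg1', room_id: 'room1', user_id: 'user1', content: 'Looks good', timestamp: 1700000000000 },
]

// groupMessagesByThreads(rooms, messages) would return roughly:
// {
//   threadA: {
//     resolved: true,
//     resolved_at: 1700000001000,
//     resolved_by_user_id: 'user2',
//     messages: [
//       { id: 'msg1', content: 'Looks good', timestamp: 1700000000000, user_id: 'user1' },
//     ],
//   },
// }
```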

313
services/chat/app/js/Features/Messages/MessageHttpController.js Normal file

@@ -0,0 +1,313 @@
import logger from '@overleaf/logger'
import * as MessageManager from './MessageManager.js'
import * as MessageFormatter from './MessageFormatter.js'
import * as ThreadManager from '../Threads/ThreadManager.js'
import { ObjectId } from '../../mongodb.js'
const DEFAULT_MESSAGE_LIMIT = 50
const MAX_MESSAGE_LENGTH = 10 * 1024 // 10kb, about 1,500 words
function readContext(context, req) {
req.body = context.requestBody
req.params = context.params.path
req.query = context.params.query
if (typeof req.params.projectId !== 'undefined') {
if (!ObjectId.isValid(req.params.projectId)) {
context.res.status(400).setBody('Invalid projectId')
}
}
if (typeof req.params.threadId !== 'undefined') {
if (!ObjectId.isValid(req.params.threadId)) {
context.res.status(400).setBody('Invalid threadId')
}
}
}
/**
* @param context
* @param {(req: unknown, res: unknown) => Promise<unknown>} ControllerMethod
* @returns {Promise<*>}
*/
export async function callMessageHttpController(context, ControllerMethod) {
const req = {}
readContext(context, req)
if (context.res.statusCode !== 400) {
return await ControllerMethod(req, context.res)
} else {
return context.res.body
}
}
export async function getGlobalMessages(context) {
return await callMessageHttpController(context, _getGlobalMessages)
}
export async function sendGlobalMessage(context) {
return await callMessageHttpController(context, _sendGlobalMessage)
}
export async function sendMessage(context) {
return await callMessageHttpController(context, _sendThreadMessage)
}
export async function getThreads(context) {
return await callMessageHttpController(context, _getAllThreads)
}
export async function resolveThread(context) {
return await callMessageHttpController(context, _resolveThread)
}
export async function reopenThread(context) {
return await callMessageHttpController(context, _reopenThread)
}
export async function deleteThread(context) {
return await callMessageHttpController(context, _deleteThread)
}
export async function editMessage(context) {
return await callMessageHttpController(context, _editMessage)
}
export async function deleteMessage(context) {
return await callMessageHttpController(context, _deleteMessage)
}
export async function deleteUserMessage(context) {
return await callMessageHttpController(context, _deleteUserMessage)
}
export async function getResolvedThreadIds(context) {
return await callMessageHttpController(context, _getResolvedThreadIds)
}
export async function destroyProject(context) {
return await callMessageHttpController(context, _destroyProject)
}
export async function duplicateCommentThreads(context) {
return await callMessageHttpController(context, _duplicateCommentThreads)
}
export async function generateThreadData(context) {
return await callMessageHttpController(context, _generateThreadData)
}
export async function getStatus(context) {
const message = 'chat is alive'
context.res.status(200).setBody(message)
return message
}
const _getGlobalMessages = async (req, res) => {
await _getMessages(ThreadManager.GLOBAL_THREAD, req, res)
}
async function _sendGlobalMessage(req, res) {
const { user_id: userId, content } = req.body
const { projectId } = req.params
return await _sendMessage(
userId,
projectId,
content,
ThreadManager.GLOBAL_THREAD,
res
)
}
async function _sendThreadMessage(req, res) {
const { user_id: userId, content } = req.body
const { projectId, threadId } = req.params
return await _sendMessage(userId, projectId, content, threadId, res)
}
const _getAllThreads = async (req, res) => {
const { projectId } = req.params
logger.debug({ projectId }, 'getting all threads')
const rooms = await ThreadManager.findAllThreadRooms(projectId)
const roomIds = rooms.map(r => r._id)
const messages = await MessageManager.findAllMessagesInRooms(roomIds)
const threads = MessageFormatter.groupMessagesByThreads(rooms, messages)
res.json(threads)
}
const _generateThreadData = async (req, res) => {
const { projectId } = req.params
const { threads } = req.body
logger.debug({ projectId }, 'getting all threads')
const rooms = await ThreadManager.findThreadsById(projectId, threads)
const roomIds = rooms.map(r => r._id)
const messages = await MessageManager.findAllMessagesInRooms(roomIds)
logger.debug({ rooms, messages }, 'looked up messages in the rooms')
const threadData = MessageFormatter.groupMessagesByThreads(rooms, messages)
res.json(threadData)
}
const _resolveThread = async (req, res) => {
const { projectId, threadId } = req.params
const { user_id: userId } = req.body
logger.debug({ userId, projectId, threadId }, 'marking thread as resolved')
await ThreadManager.resolveThread(projectId, threadId, userId)
res.status(204)
}
const _reopenThread = async (req, res) => {
const { projectId, threadId } = req.params
logger.debug({ projectId, threadId }, 'reopening thread')
await ThreadManager.reopenThread(projectId, threadId)
res.status(204)
}
const _deleteThread = async (req, res) => {
const { projectId, threadId } = req.params
logger.debug({ projectId, threadId }, 'deleting thread')
const roomId = await ThreadManager.deleteThread(projectId, threadId)
await MessageManager.deleteAllMessagesInRoom(roomId)
res.status(204)
}
const _editMessage = async (req, res) => {
const { content, userId } = req.body
const { projectId, threadId, messageId } = req.params
logger.debug({ projectId, threadId, messageId, content }, 'editing message')
const room = await ThreadManager.findOrCreateThread(projectId, threadId)
const found = await MessageManager.updateMessage(
room._id,
messageId,
userId,
content,
Date.now()
)
if (!found) {
res.status(404)
return
}
res.status(204)
}
const _deleteMessage = async (req, res) => {
const { projectId, threadId, messageId } = req.params
logger.debug({ projectId, threadId, messageId }, 'deleting message')
const room = await ThreadManager.findOrCreateThread(projectId, threadId)
await MessageManager.deleteMessage(room._id, messageId)
res.status(204)
}
const _deleteUserMessage = async (req, res) => {
const { projectId, threadId, userId, messageId } = req.params
const room = await ThreadManager.findOrCreateThread(projectId, threadId)
await MessageManager.deleteUserMessage(userId, room._id, messageId)
res.status(204)
}
const _getResolvedThreadIds = async (req, res) => {
const { projectId } = req.params
const resolvedThreadIds = await ThreadManager.getResolvedThreadIds(projectId)
res.json({ resolvedThreadIds })
}
const _destroyProject = async (req, res) => {
const { projectId } = req.params
logger.debug({ projectId }, 'destroying project')
const rooms = await ThreadManager.findAllThreadRoomsAndGlobalThread(projectId)
const roomIds = rooms.map(r => r._id)
logger.debug({ projectId, roomIds }, 'deleting all messages in rooms')
await MessageManager.deleteAllMessagesInRooms(roomIds)
logger.debug({ projectId }, 'deleting all threads in project')
await ThreadManager.deleteAllThreadsInProject(projectId)
res.status(204)
}
async function _sendMessage(userId, projectId, content, clientThreadId, res) {
if (!ObjectId.isValid(userId)) {
const message = 'Invalid userId'
res.status(400).setBody(message)
return message
}
if (!content) {
const message = 'No content provided'
res.status(400).setBody(message)
return message
}
if (content.length > MAX_MESSAGE_LENGTH) {
const message = `Content too long (> ${MAX_MESSAGE_LENGTH} bytes)`
res.status(400).setBody(message)
return message
}
logger.debug(
{ clientThreadId, projectId, userId, content },
'new message received'
)
const thread = await ThreadManager.findOrCreateThread(
projectId,
clientThreadId
)
let message = await MessageManager.createMessage(
thread._id,
userId,
content,
Date.now()
)
message = MessageFormatter.formatMessageForClientSide(message)
message.room_id = projectId
res.status(201).setBody(message)
}
async function _getMessages(clientThreadId, req, res) {
let before, limit
const { projectId } = req.params
if (req.query.before) {
before = parseInt(req.query.before, 10)
} else {
before = null
}
if (req.query.limit) {
limit = parseInt(req.query.limit, 10)
} else {
limit = DEFAULT_MESSAGE_LIMIT
}
logger.debug(
{ limit, before, projectId, clientThreadId },
'get message request received'
)
const thread = await ThreadManager.findOrCreateThread(
projectId,
clientThreadId
)
const threadObjectId = thread._id
logger.debug(
{ limit, before, projectId, clientThreadId, threadObjectId },
'found or created thread'
)
let messages = await MessageManager.getMessages(threadObjectId, limit, before)
messages = MessageFormatter.formatMessagesForClientSide(messages)
logger.debug({ projectId, messages }, 'got messages')
res.status(200).setBody(messages)
}
async function _duplicateCommentThreads(req, res) {
const { projectId } = req.params
const { threads } = req.body
const result = {}
for (const id of threads) {
logger.debug({ projectId, thread: id }, 'duplicating thread')
try {
const { oldRoom, newRoom } = await ThreadManager.duplicateThread(
projectId,
id
)
await MessageManager.duplicateRoomToOtherRoom(oldRoom._id, newRoom._id)
result[id] = { duplicateId: newRoom.thread_id }
} catch (error) {
if (error instanceof ThreadManager.MissingThreadError) {
// Expected error when the comment has been deleted prior to duplication
result[id] = { error: 'not found' }
} else {
logger.err({ error }, 'error duplicating thread')
result[id] = { error: 'unknown' }
}
}
}
res.json({ newThreads: result })
}
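
For orientation, the sketch below mocks the exegesis context object that `readContext` and `callMessageHttpController` consume. The field names mirror the properties accessed above (`requestBody`, `params.path`, `params.query`, `res.status().setBody()`); the fake `res` and the ID values are assumptions for illustration, not the real exegesis API surface.

```js
// Hypothetical stand-in for the exegesis response object used above.
const fakeRes = {
  statusCode: 200,
  body: undefined,
  status(code) {
    this.statusCode = code
    return this
  },
  setBody(body) {
    this.body = body
    return this
  },
}

const context = {
  requestBody: { user_id: '64c8e1f2a1b2c3d4e5f60002', content: 'hi' },
  params: {
    path: { projectId: '64c8e1f2a1b2c3d4e5f60001' },
    query: {},
  },
  res: fakeRes,
}

// sendGlobalMessage(context) would validate the ids via readContext, build an
// internal `req` object, and call _sendGlobalMessage(req, context.res).
```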

112
services/chat/app/js/Features/Messages/MessageManager.js Normal file

@@ -0,0 +1,112 @@
import { db, ObjectId } from '../../mongodb.js'
export async function createMessage(roomId, userId, content, timestamp) {
let newMessageOpts = {
content,
room_id: roomId,
user_id: userId,
timestamp,
}
newMessageOpts = _ensureIdsAreObjectIds(newMessageOpts)
const confirmation = await db.messages.insertOne(newMessageOpts)
newMessageOpts._id = confirmation.insertedId
return newMessageOpts
}
export async function getMessages(roomId, limit, before) {
let query = { room_id: roomId }
if (before) {
query.timestamp = { $lt: before }
}
query = _ensureIdsAreObjectIds(query)
return await db.messages
.find(query)
.sort({ timestamp: -1 })
.limit(limit)
.toArray()
}
export async function findAllMessagesInRooms(roomIds) {
return await db.messages
.find({
room_id: { $in: roomIds },
})
.toArray()
}
export async function deleteAllMessagesInRoom(roomId) {
await db.messages.deleteMany({
room_id: roomId,
})
}
export async function deleteAllMessagesInRooms(roomIds) {
await db.messages.deleteMany({
room_id: { $in: roomIds },
})
}
export async function updateMessage(
roomId,
messageId,
userId,
content,
timestamp
) {
const query = _ensureIdsAreObjectIds({
_id: messageId,
room_id: roomId,
})
if (userId) {
query.user_id = new ObjectId(userId)
}
const res = await db.messages.updateOne(query, {
$set: {
content,
edited_at: timestamp,
},
})
return res.modifiedCount === 1
}
export async function deleteMessage(roomId, messageId) {
const query = _ensureIdsAreObjectIds({
_id: messageId,
room_id: roomId,
})
await db.messages.deleteOne(query)
}
export async function deleteUserMessage(userId, roomId, messageId) {
await db.messages.deleteOne({
_id: new ObjectId(messageId),
user_id: new ObjectId(userId),
room_id: new ObjectId(roomId),
})
}
function _ensureIdsAreObjectIds(query) {
if (query.user_id && !(query.user_id instanceof ObjectId)) {
query.user_id = new ObjectId(query.user_id)
}
if (query.room_id && !(query.room_id instanceof ObjectId)) {
query.room_id = new ObjectId(query.room_id)
}
if (query._id && !(query._id instanceof ObjectId)) {
query._id = new ObjectId(query._id)
}
return query
}
export async function duplicateRoomToOtherRoom(sourceRoomId, targetRoomId) {
const sourceMessages = await findAllMessagesInRooms([sourceRoomId])
const targetMessages = sourceMessages.map(comment => {
return _ensureIdsAreObjectIds({
room_id: targetRoomId,
content: comment.content,
timestamp: comment.timestamp,
user_id: comment.user_id,
})
})
await db.messages.insertMany(targetMessages)
}
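
A usage sketch for the manager above, assuming a connected database from `../../mongodb.js`. The IDs are illustrative 24-character hex strings so `_ensureIdsAreObjectIds` can convert them; pagination works by passing the oldest timestamp of the previous page as `before`.

```js
import * as MessageManager from './MessageManager.js'

const roomId = '64c8e1f2a1b2c3d4e5f60010'
const userId = '64c8e1f2a1b2c3d4e5f60002'

await MessageManager.createMessage(roomId, userId, 'first!', Date.now())

// Most recent page of up to 50 messages (sorted newest first)...
const page1 = await MessageManager.getMessages(roomId, 50, null)
// ...then page backwards in time from the oldest message seen so far.
const oldest = page1[page1.length - 1]
const page2 = await MessageManager.getMessages(roomId, 50, oldest.timestamp)
```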

157
services/chat/app/js/Features/Threads/ThreadManager.js Normal file

@@ -0,0 +1,157 @@
import { db, ObjectId } from '../../mongodb.js'
export class MissingThreadError extends Error {}
export const GLOBAL_THREAD = 'GLOBAL'
export async function findOrCreateThread(projectId, threadId) {
let query, update
projectId = new ObjectId(projectId.toString())
if (threadId !== GLOBAL_THREAD) {
threadId = new ObjectId(threadId.toString())
}
if (threadId === GLOBAL_THREAD) {
query = {
project_id: projectId,
thread_id: { $exists: false },
}
update = {
project_id: projectId,
}
} else {
query = {
project_id: projectId,
thread_id: threadId,
}
update = {
project_id: projectId,
thread_id: threadId,
}
}
const result = await db.rooms.findOneAndUpdate(
query,
{ $set: update },
{ upsert: true, returnDocument: 'after' }
)
return result
}
export async function findAllThreadRooms(projectId) {
return await db.rooms
.find(
{
project_id: new ObjectId(projectId.toString()),
thread_id: { $exists: true },
},
{
thread_id: 1,
resolved: 1,
}
)
.toArray()
}
export async function findAllThreadRoomsAndGlobalThread(projectId) {
return await db.rooms
.find(
{
project_id: new ObjectId(projectId.toString()),
},
{
thread_id: 1,
resolved: 1,
}
)
.toArray()
}
export async function resolveThread(projectId, threadId, userId) {
await db.rooms.updateOne(
{
project_id: new ObjectId(projectId.toString()),
thread_id: new ObjectId(threadId.toString()),
},
{
$set: {
resolved: {
user_id: userId,
ts: new Date(),
},
},
}
)
}
export async function reopenThread(projectId, threadId) {
await db.rooms.updateOne(
{
project_id: new ObjectId(projectId.toString()),
thread_id: new ObjectId(threadId.toString()),
},
{
$unset: {
resolved: true,
},
}
)
}
export async function deleteThread(projectId, threadId) {
const room = await findOrCreateThread(projectId, threadId)
await db.rooms.deleteOne({
_id: room._id,
})
return room._id
}
export async function deleteAllThreadsInProject(projectId) {
await db.rooms.deleteMany({
project_id: new ObjectId(projectId.toString()),
})
}
export async function getResolvedThreadIds(projectId) {
const resolvedThreadIds = await db.rooms
.find(
{
project_id: new ObjectId(projectId),
thread_id: { $exists: true },
resolved: { $exists: true },
},
{ projection: { thread_id: 1 } }
)
.map(record => record.thread_id.toString())
.toArray()
return resolvedThreadIds
}
export async function duplicateThread(projectId, threadId) {
const room = await db.rooms.findOne({
project_id: new ObjectId(projectId),
thread_id: new ObjectId(threadId),
})
if (!room) {
throw new MissingThreadError('Trying to duplicate a non-existent thread')
}
const newRoom = {
project_id: room.project_id,
thread_id: new ObjectId(),
}
if (room.resolved) {
newRoom.resolved = room.resolved
}
const confirmation = await db.rooms.insertOne(newRoom)
newRoom._id = confirmation.insertedId
return { oldRoom: room, newRoom }
}
export async function findThreadsById(projectId, threadIds) {
return await db.rooms
.find({
project_id: new ObjectId(projectId),
thread_id: { $in: threadIds.map(id => new ObjectId(id)) },
})
.toArray()
}
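The duplicate-comment-threads endpoint exercised by the acceptance tests presumably combines duplicateThread above with MessageManager.duplicateRoomToOtherRoom. A minimal sketch of that wiring, assuming relative import paths and a response shape inferred from the acceptance tests rather than taken from the actual controller:

import * as ThreadManager from './ThreadManager.js' // path assumed
import * as MessageManager from '../Messages/MessageManager.js' // path assumed

// Hypothetical helper: duplicate each thread's room, copy its messages into
// the new room, and report a mapping of old thread id -> duplicate thread id.
async function duplicateCommentThreads(projectId, threadIds) {
  const newThreads = {}
  for (const threadId of threadIds) {
    const { oldRoom, newRoom } = await ThreadManager.duplicateThread(
      projectId,
      threadId
    )
    await MessageManager.duplicateRoomToOtherRoom(oldRoom._id, newRoom._id)
    newThreads[threadId] = { duplicateId: newRoom.thread_id.toString() }
  }
  return { newThreads }
}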

View File

@@ -0,0 +1,18 @@
import Metrics from '@overleaf/metrics'
import Settings from '@overleaf/settings'
import { MongoClient } from 'mongodb'
export { ObjectId } from 'mongodb'
export const mongoClient = new MongoClient(
Settings.mongo.url,
Settings.mongo.options
)
const mongoDb = mongoClient.db()
export const db = {
messages: mongoDb.collection('messages'),
rooms: mongoDb.collection('rooms'),
}
Metrics.mongodb.monitor(mongoClient)

View File

@@ -0,0 +1,51 @@
import http from 'node:http'
import metrics from '@overleaf/metrics'
import logger from '@overleaf/logger'
import express from 'express'
import exegesisExpress from 'exegesis-express'
import path from 'node:path'
import { fileURLToPath } from 'node:url'
import * as messagesController from './Features/Messages/MessageHttpController.js'
const __dirname = fileURLToPath(new URL('.', import.meta.url))
logger.initialize('chat')
metrics.open_sockets.monitor()
metrics.leaked_sockets.monitor(logger)
export async function createServer() {
const app = express()
app.use(metrics.http.monitor(logger))
metrics.injectMetricsRoute(app)
// See https://github.com/exegesis-js/exegesis/blob/master/docs/Options.md
const options = {
controllers: { messagesController },
ignoreServers: true,
allowMissingControllers: false,
}
const exegesisMiddleware = await exegesisExpress.middleware(
path.resolve(__dirname, '../../chat.yaml'),
options
)
// If you have any body parsers, this should go before them.
app.use(exegesisMiddleware)
// Return a 404
app.use((req, res) => {
res.status(404).json({ message: `Not found` })
})
// Handle any unexpected errors
app.use((err, req, res, next) => {
res.status(500).json({ message: `Internal error: ${err.message}` })
})
const server = http.createServer(app)
return { app, server }
}
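A minimal boot sketch showing how createServer might be used from an entry point. The actual app.js is not included in this excerpt, so the import path and log line are assumptions; the host and port come from the internal.chat settings defaults.

import Settings from '@overleaf/settings'
import logger from '@overleaf/logger'
import { createServer } from './app/js/server.js' // path assumed

const { host, port } = Settings.internal.chat
const { server } = await createServer()
server.listen(port, host, () => {
  logger.info({ host, port }, 'chat service listening') // hypothetical log line
})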

View File

@@ -0,0 +1,10 @@
/**
* Transform an async function into an Express middleware
*
* Any error will be passed to the error middlewares via `next()`
*/
export function expressify(fn) {
return (req, res, next) => {
fn(req, res, next).catch(next)
}
}
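A usage sketch under the assumption of a plain Express app; the route and import path are made up for illustration, while the helper itself is the one defined above.

import express from 'express'
import { expressify } from './app/js/util/expressify.js' // path assumed

const app = express()

// A rejected promise from the async handler is forwarded to the Express
// error-handling middleware via next() instead of being left unhandled.
app.get(
  '/example', // hypothetical route
  expressify(async (req, res) => {
    res.json({ ok: true })
  })
)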

View File

@@ -0,0 +1,9 @@
chat
--dependencies=mongo
--docker-repos=us-east1-docker.pkg.dev/overleaf-ops/ol-docker
--env-add=
--env-pass-through=
--esmock-loader=False
--node-version=20.18.2
--public-repo=False
--script-version=4.7.0

416
services/chat/chat.yaml Normal file
View File

@@ -0,0 +1,416 @@
openapi: 3.1.0
x-stoplight:
id: okoe8mh50pjec
info:
title: chat
version: '1.0'
servers:
- url: 'http://chat:3010'
x-exegesis-controller: messagesController
paths:
'/project/{projectId}/messages':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
get:
summary: Get Global messages
tags: []
responses:
        '200':
description: OK
content:
application/json:
schema:
type: array
items:
$ref: '#/components/schemas/Message'
operationId: getGlobalMessages
description: Get global messages for the project with Project ID provided
parameters:
- schema:
type: string
in: query
name: before
- schema:
type: string
in: query
name: limit
post:
summary: Send Global message
operationId: sendGlobalMessage
responses:
'201':
description: OK
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/Message'
examples:
example-1:
value:
user_id: string
content: string
description: 'UserID and Content of the message to be posted. '
description: Send global message for the project with Project ID provided
'/project/{projectId}/thread/{threadId}/messages':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
- schema:
type: string
name: threadId
in: path
required: true
post:
summary: Send message
operationId: sendMessage
responses:
'201':
description: Created
description: Add a message to the thread with thread ID provided from the Project with Project ID provided.
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/Message'
description: |-
JSON object with :
- user_id: Id of the user
- content: Content of the message
'/project/{projectId}/threads':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
get:
summary: Get Threads
tags: []
responses:
'200':
description: OK
content:
application/json:
schema:
type: array
items:
$ref: '#/components/schemas/Thread'
examples: {}
'404':
description: Not Found
operationId: getThreads
description: Get the list of threads for the project with Project ID provided
'/project/{projectId}/thread/{threadId}/messages/{messageId}/edit':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
- schema:
type: string
name: threadId
in: path
required: true
- schema:
type: string
name: messageId
in: path
required: true
post:
summary: Edit message
operationId: editMessage
responses:
'204':
description: No Content
'404':
description: Not Found
requestBody:
content:
application/json:
schema:
type: object
properties:
content:
type: string
user_id:
type: string
readOnly: true
required:
- content
examples: {}
description: |-
JSON object with :
- content: Content of the message to edit
- user_id: Id of the user (optional)
description: |
Update message with Message ID provided from the Thread ID and Project ID provided
'/project/{projectId}/thread/{threadId}/messages/{messageId}':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
- schema:
type: string
name: threadId
in: path
required: true
- schema:
type: string
name: messageId
in: path
required: true
delete:
summary: Delete message
operationId: deleteMessage
responses:
'204':
description: No Content
description: 'Delete message with Message ID provided, from the Thread with ThreadID and ProjectID provided'
'/project/{projectId}/thread/{threadId}/user/{userId}/messages/{messageId}':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
- schema:
type: string
name: threadId
in: path
required: true
- schema:
type: string
name: userId
in: path
required: true
- schema:
type: string
name: messageId
in: path
required: true
delete:
summary: Delete message written by a given user
operationId: deleteUserMessage
responses:
'204':
description: No Content
'/project/{projectId}/thread/{threadId}/resolve':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
- schema:
type: string
name: threadId
in: path
required: true
post:
summary: Resolve Thread
operationId: resolveThread
responses:
'204':
description: No Content
requestBody:
content:
application/json:
schema:
type: object
properties:
user_id:
type: string
required:
- user_id
description: |-
JSON object with :
- user_id: Id of the user.
description: Mark Thread with ThreadID and ProjectID provided owned by the user with UserID provided as resolved.
'/project/{projectId}/thread/{threadId}/reopen':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
- schema:
type: string
name: threadId
in: path
required: true
post:
summary: Reopen Thread
operationId: reopenThread
responses:
'204':
description: No Content
description: |-
Reopen Thread with ThreadID and ProjectID provided.
        i.e. unmark it as resolved.
'/project/{projectId}/thread/{threadId}':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
- schema:
type: string
name: threadId
in: path
required: true
delete:
summary: Delete thread
operationId: deleteThread
responses:
'204':
description: No Content
description: Delete thread with ThreadID and ProjectID provided
'/project/{projectId}/resolved-thread-ids':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
get:
summary: Get resolved thread ids
operationId: getResolvedThreadIds
responses:
'200':
description: Resolved thread ids
'/project/{projectId}':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
delete:
summary: Destroy project
operationId: destroyProject
responses:
'204':
description: No Content
description: 'Delete all threads from Project with Project ID provided, and all messages in those threads.'
/status:
get:
summary: Check status
tags: []
responses:
'200':
description: OK
content:
application/json:
schema:
type: string
description: chat is alive
operationId: getStatus
description: Check that the Chat service is alive
head:
summary: Check status
tags: []
responses:
'200':
description: OK
content:
application/json:
schema:
type: string
description: chat is alive
operationId: getStatus
description: Check that the Chat service is alive
'/project/{projectId}/duplicate-comment-threads':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
post:
summary: Duplicate comment threads
operationId: duplicateCommentThreads
requestBody:
content:
application/json:
schema:
type: object
properties:
threads:
type: array
items:
type: string
responses:
'200':
content:
application/json:
schema:
type: object
properties:
newThreads:
type: object
description: Mapping of old thread ids to their duplicated thread ids
description: Duplicate a list of comment threads
'/project/{projectId}/generate-thread-data':
parameters:
- schema:
type: string
name: projectId
in: path
required: true
post:
summary: Generate thread data to load into the frontend
operationId: generateThreadData
requestBody:
content:
application/json:
schema:
type: object
properties:
threads:
type: array
items:
type: string
responses:
'200':
content:
application/json:
schema:
type: object
description: Load threads and generate a json blob containing all messages in all the threads
components:
schemas:
Message:
title: Message
x-stoplight:
id: ue9n1vvezlutw
type: object
examples:
- user_id: string
- content: string
properties:
user_id:
type: string
content:
type: string
required:
- user_id
- content
Thread:
title: Thread
x-stoplight:
id: 0ppt3jw4h5bua
type: array
items:
$ref: '#/components/schemas/Message'
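A minimal client sketch against the API described above, assuming the `http://chat:3010` address from the servers block and Node's global fetch; the ids are placeholder ObjectId strings and the expected status codes follow the responses documented for each operation.

const base = 'http://chat:3010' // servers entry above
const projectId = '64d2f0c2a4e6b7c8d9e0f1a2' // placeholder id
const userId = '64d2f0c2a4e6b7c8d9e0f1a3' // placeholder id

// Send a global message (documented as responding with 201)
const sent = await fetch(`${base}/project/${projectId}/messages`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ user_id: userId, content: 'hello' }),
})
console.log(sent.status)

// List the comment threads for the project (documented as responding with 200)
const threads = await (
  await fetch(`${base}/project/${projectId}/threads`)
).json()
console.log(Object.keys(threads))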

View File

@@ -0,0 +1,33 @@
const http = require('node:http')
const https = require('node:https')
http.globalAgent.keepAlive = false
https.globalAgent.keepAlive = false
module.exports = {
internal: {
chat: {
host: process.env.LISTEN_ADDRESS || '127.0.0.1',
port: 3010,
},
},
apis: {
web: {
url: `http://${process.env.WEB_HOST || '127.0.0.1'}:${
process.env.WEB_PORT || 3000
}`,
user: process.env.WEB_API_USER || 'overleaf',
pass: process.env.WEB_API_PASSWORD || 'password',
},
},
mongo: {
url:
process.env.MONGO_CONNECTION_STRING ||
`mongodb://${process.env.MONGO_HOST || '127.0.0.1'}/sharelatex`,
options: {
monitorCommands: true,
},
},
}

View File

@@ -0,0 +1,52 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
version: "2.3"
services:
test_unit:
image: ci/$PROJECT_NAME:$BRANCH_NAME-$BUILD_NUMBER
user: node
command: npm run test:unit:_run
environment:
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
test_acceptance:
build: .
image: ci/$PROJECT_NAME:$BRANCH_NAME-$BUILD_NUMBER
environment:
ELASTIC_SEARCH_DSN: es:9200
MONGO_HOST: mongo
POSTGRES_HOST: postgres
MOCHA_GREP: ${MOCHA_GREP}
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
depends_on:
mongo:
condition: service_started
user: node
command: npm run test:acceptance
tar:
build: .
image: ci/$PROJECT_NAME:$BRANCH_NAME-$BUILD_NUMBER
volumes:
- ./:/tmp/build/
command: tar -czf /tmp/build/build.tar.gz --exclude=build.tar.gz --exclude-vcs .
user: root
mongo:
image: mongo:6.0.13
command: --replSet overleaf
volumes:
- ../../bin/shared/mongodb-init-replica-set.js:/docker-entrypoint-initdb.d/mongodb-init-replica-set.js
environment:
MONGO_INITDB_DATABASE: sharelatex
extra_hosts:
# Required when using the automatic database setup for initializing the
# replica set. This override is not needed when running the setup after
# starting up mongo.
- mongo:127.0.0.1

View File

@@ -0,0 +1,56 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
version: "2.3"
services:
test_unit:
image: node:20.18.2
volumes:
- .:/overleaf/services/chat
- ../../node_modules:/overleaf/node_modules
- ../../libraries:/overleaf/libraries
working_dir: /overleaf/services/chat
environment:
MOCHA_GREP: ${MOCHA_GREP}
LOG_LEVEL: ${LOG_LEVEL:-}
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
command: npm run --silent test:unit
user: node
test_acceptance:
image: node:20.18.2
volumes:
- .:/overleaf/services/chat
- ../../node_modules:/overleaf/node_modules
- ../../libraries:/overleaf/libraries
working_dir: /overleaf/services/chat
environment:
ELASTIC_SEARCH_DSN: es:9200
MONGO_HOST: mongo
POSTGRES_HOST: postgres
MOCHA_GREP: ${MOCHA_GREP}
LOG_LEVEL: ${LOG_LEVEL:-}
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
user: node
depends_on:
mongo:
condition: service_started
command: npm run --silent test:acceptance
mongo:
image: mongo:6.0.13
command: --replSet overleaf
volumes:
- ../../bin/shared/mongodb-init-replica-set.js:/docker-entrypoint-initdb.d/mongodb-init-replica-set.js
environment:
MONGO_INITDB_DATABASE: sharelatex
extra_hosts:
# Required when using the automatic database setup for initializing the
# replica set. This override is not needed when running the setup after
# starting up mongo.
- mongo:127.0.0.1

View File

@@ -0,0 +1,49 @@
{
"name": "@overleaf/chat",
"description": "The backend API that powers Overleaf chat",
"private": true,
"main": "app.js",
"type": "module",
"scripts": {
"start": "node app.js",
"test:acceptance": "npm run test:acceptance:_run -- --grep=$MOCHA_GREP",
"test:unit": "npm run test:unit:_run -- --grep=$MOCHA_GREP",
"nodemon": "node --watch app.js",
"test:acceptance:_run": "mocha --recursive --reporter spec --timeout 15000 --exit $@ test/acceptance/js",
"test:unit:_run": "mocha --recursive --reporter spec $@ test/unit/js",
"lint": "eslint --max-warnings 0 --format unix .",
"format": "prettier --list-different $PWD/'**/*.*js'",
"format:fix": "prettier --write $PWD/'**/*.*js'",
"lint:fix": "eslint --fix .",
"types:check": "tsc --noEmit"
},
"dependencies": {
"@overleaf/logger": "*",
"@overleaf/metrics": "*",
"@overleaf/settings": "*",
"async": "^3.2.5",
"body-parser": "^1.20.3",
"exegesis-express": "^4.0.0",
"express": "^4.21.2",
"mongodb": "6.12.0"
},
"devDependencies": {
"acorn": "^7.1.1",
"ajv": "^6.12.0",
"chai": "^4.3.6",
"chai-as-promised": "^7.1.1",
"mocha": "^11.1.0",
"request": "^2.88.2",
"sandboxed-module": "^2.0.4",
"sinon": "^9.2.4",
"timekeeper": "^2.2.0",
"typescript": "^5.0.4"
},
"version": "1.0.0",
"directories": {
"test": "test"
},
"keywords": [],
"author": "",
"license": "AGPL-3.0"
}

View File

@@ -0,0 +1,93 @@
import { ObjectId } from '../../../app/js/mongodb.js'
import { expect } from 'chai'
import * as ChatClient from './helpers/ChatClient.js'
import * as ChatApp from './helpers/ChatApp.js'
const user1Id = new ObjectId().toString()
const user2Id = new ObjectId().toString()
async function createCommentThread(projectId, threadId = new ObjectId()) {
const { response: response1 } = await ChatClient.sendMessage(
projectId,
threadId.toString(),
user1Id,
'message 1'
)
expect(response1.statusCode).to.equal(201)
const { response: response2 } = await ChatClient.sendMessage(
projectId,
threadId,
user2Id,
'message 2'
)
expect(response2.statusCode).to.equal(201)
return threadId.toString()
}
describe('Cloning comment threads', async function () {
const projectId = new ObjectId().toString()
before(async function () {
await ChatApp.ensureRunning()
this.thread1Id = await createCommentThread(projectId)
this.thread2Id = await createCommentThread(projectId)
this.thread3Id = await createCommentThread(projectId)
})
describe('with non-orphaned threads', async function () {
before(async function () {
const {
response: { body: result, statusCode },
} = await ChatClient.duplicateCommentThreads(projectId, [this.thread3Id])
this.result = result
expect(statusCode).to.equal(200)
expect(this.result).to.have.property('newThreads')
this.newThreadId = this.result.newThreads[this.thread3Id].duplicateId
})
it('should duplicate threads', function () {
expect(this.result.newThreads).to.have.property(this.thread3Id)
expect(this.result.newThreads[this.thread3Id]).to.have.property(
'duplicateId'
)
expect(this.result.newThreads[this.thread3Id].duplicateId).to.not.equal(
this.thread3Id
)
})
    it('should not duplicate other threads', function () {
expect(this.result.newThreads).to.not.have.property(this.thread1Id)
expect(this.result.newThreads).to.not.have.property(this.thread2Id)
})
it('should duplicate the messages in the thread', async function () {
const {
response: { body: threads },
} = await ChatClient.getThreads(projectId)
function ignoreId(comment) {
return {
...comment,
id: undefined,
}
}
expect(threads[this.thread3Id].messages.map(ignoreId)).to.deep.equal(
threads[this.newThreadId].messages.map(ignoreId)
)
})
it('should have two separate unlinked threads', async function () {
await ChatClient.sendMessage(
projectId,
this.newThreadId,
user1Id,
'third message'
)
const {
response: { body: threads },
} = await ChatClient.getThreads(projectId)
expect(threads[this.thread3Id].messages.length).to.equal(2)
expect(threads[this.newThreadId].messages.length).to.equal(3)
})
})
})

View File

@@ -0,0 +1,47 @@
import { ObjectId } from '../../../app/js/mongodb.js'
import { expect } from 'chai'
import * as ChatClient from './helpers/ChatClient.js'
import * as ChatApp from './helpers/ChatApp.js'
describe('Deleting a message', async function () {
const projectId = new ObjectId().toString()
const userId = new ObjectId().toString()
const threadId = new ObjectId().toString()
before(async function () {
await ChatApp.ensureRunning()
})
describe('in a thread', async function () {
before(async function () {
const { response } = await ChatClient.sendMessage(
projectId,
threadId,
userId,
'first message'
)
expect(response.statusCode).to.equal(201)
const { response: response2, body: message } =
await ChatClient.sendMessage(
projectId,
threadId,
userId,
'deleted message'
)
expect(response2.statusCode).to.equal(201)
const { response: response3 } = await ChatClient.deleteMessage(
projectId,
threadId,
message.id
)
expect(response3.statusCode).to.equal(204)
})
it('should then remove the message from the threads', async function () {
const { response, body: threads } = await ChatClient.getThreads(projectId)
expect(response.statusCode).to.equal(200)
expect(threads[threadId].messages.length).to.equal(1)
})
})
})

View File

@@ -0,0 +1,38 @@
import { ObjectId } from '../../../app/js/mongodb.js'
import { expect } from 'chai'
import * as ChatClient from './helpers/ChatClient.js'
import * as ChatApp from './helpers/ChatApp.js'
describe('Deleting a thread', async function () {
const projectId = new ObjectId().toString()
const userId = new ObjectId().toString()
before(async function () {
await ChatApp.ensureRunning()
})
describe('with a thread that is deleted', async function () {
const threadId = new ObjectId().toString()
const content = 'deleted thread message'
before(async function () {
const { response } = await ChatClient.sendMessage(
projectId,
threadId,
userId,
content
)
expect(response.statusCode).to.equal(201)
const { response: response2 } = await ChatClient.deleteThread(
projectId,
threadId
)
expect(response2.statusCode).to.equal(204)
})
it('should then not list the thread for the project', async function () {
const { response, body: threads } = await ChatClient.getThreads(projectId)
expect(response.statusCode).to.equal(200)
expect(Object.keys(threads).length).to.equal(0)
})
})
})

View File

@@ -0,0 +1,66 @@
import { ObjectId } from '../../../app/js/mongodb.js'
import { expect } from 'chai'
import * as ChatClient from './helpers/ChatClient.js'
import * as ChatApp from './helpers/ChatApp.js'
const db = ChatApp.db
async function getMessage(messageId) {
return await db.messages.findOne({
_id: new ObjectId(messageId),
})
}
describe('Destroying a project', async function () {
const projectId = new ObjectId().toString()
const userId = new ObjectId().toString()
before(async function () {
await ChatApp.ensureRunning()
})
describe('with a project that has threads and messages', async function () {
const threadId = new ObjectId().toString()
before(async function () {
const { response } = await ChatClient.sendMessage(
projectId,
threadId,
userId,
'destroyed thread message'
)
expect(response.statusCode).to.equal(201)
this.threadMessageId = response.body.id
const { response: response2 } = await ChatClient.sendGlobalMessage(
projectId,
userId,
'destroyed global message'
)
expect(response2.statusCode).to.equal(201)
this.globalThreadMessageId = response2.body.id
const threadRooms = await db.rooms
.find({ project_id: new ObjectId(projectId) })
.toArray()
expect(threadRooms.length).to.equal(2)
const threadMessage = await getMessage(this.threadMessageId)
expect(threadMessage).to.exist
const globalThreadMessage = await getMessage(this.globalThreadMessageId)
expect(globalThreadMessage).to.exist
const { response: responseDestroy } =
await ChatClient.destroyProject(projectId)
expect(responseDestroy.statusCode).to.equal(204)
})
it('should remove the messages and threads from the database', async function () {
const threadRooms = await db.rooms
.find({ project_id: new ObjectId(projectId) })
.toArray()
expect(threadRooms.length).to.equal(0)
const threadMessage = await getMessage(this.threadMessageId)
expect(threadMessage).to.be.null
const globalThreadMessage = await getMessage(this.globalThreadMessageId)
expect(globalThreadMessage).to.be.null
})
})
})

View File

@@ -0,0 +1,96 @@
import { ObjectId } from '../../../app/js/mongodb.js'
import { expect } from 'chai'
import * as ChatClient from './helpers/ChatClient.js'
import * as ChatApp from './helpers/ChatApp.js'
describe('Editing a message', async function () {
let projectId, userId, threadId
before(async function () {
await ChatApp.ensureRunning()
})
describe('in a thread', async function () {
const content = 'thread message'
const newContent = 'updated thread message'
let messageId
beforeEach(async function () {
projectId = new ObjectId().toString()
userId = new ObjectId().toString()
threadId = new ObjectId().toString()
const { response, body: message } = await ChatClient.sendMessage(
projectId,
threadId,
userId,
content
)
expect(response.statusCode).to.equal(201)
expect(message.id).to.exist
expect(message.content).to.equal(content)
messageId = message.id
})
describe('without user', function () {
beforeEach(async function () {
const { response } = await ChatClient.editMessage(
projectId,
threadId,
messageId,
newContent
)
expect(response.statusCode).to.equal(204)
})
it('should then list the updated message in the threads', async function () {
const { response, body: threads } =
await ChatClient.getThreads(projectId)
expect(response.statusCode).to.equal(200)
expect(threads[threadId].messages.length).to.equal(1)
expect(threads[threadId].messages[0].content).to.equal(newContent)
})
})
describe('with the same user', function () {
beforeEach(async function () {
const { response } = await ChatClient.editMessageWithUser(
projectId,
threadId,
messageId,
userId,
newContent
)
expect(response.statusCode).to.equal(204)
})
it('should then list the updated message in the threads', async function () {
const { response, body: threads } =
await ChatClient.getThreads(projectId)
expect(response.statusCode).to.equal(200)
expect(threads[threadId].messages.length).to.equal(1)
expect(threads[threadId].messages[0].content).to.equal(newContent)
})
})
describe('with another user', function () {
beforeEach(async function () {
const { response } = await ChatClient.editMessageWithUser(
projectId,
threadId,
messageId,
new ObjectId(),
newContent
)
expect(response.statusCode).to.equal(404)
})
it('should then list the old message in the threads', async function () {
const { response, body: threads } =
await ChatClient.getThreads(projectId)
expect(response.statusCode).to.equal(200)
expect(threads[threadId].messages.length).to.equal(1)
expect(threads[threadId].messages[0].content).to.equal(content)
})
})
})
})

View File

@@ -0,0 +1,164 @@
import { ObjectId } from '../../../app/js/mongodb.js'
import { expect } from 'chai'
import * as ChatClient from './helpers/ChatClient.js'
import * as ChatApp from './helpers/ChatApp.js'
async function getCount() {
return await ChatClient.getMetric(line => {
return (
line.includes('timer_http_request_count') &&
line.includes('path="project_{projectId}_messages"') &&
line.includes('method="POST"')
)
})
}
describe('Getting messages', async function () {
const userId1 = new ObjectId().toString()
const userId2 = new ObjectId().toString()
const content1 = 'foo bar'
const content2 = 'hello world'
before(async function () {
await ChatApp.ensureRunning()
})
describe('globally', async function () {
const projectId = new ObjectId().toString()
before(async function () {
const previousCount = await getCount()
const { response } = await ChatClient.sendGlobalMessage(
projectId,
userId1,
content1
)
expect(response.statusCode).to.equal(201)
const { response: response2 } = await ChatClient.sendGlobalMessage(
projectId,
userId2,
content2
)
expect(response2.statusCode).to.equal(201)
const { response: response3, body } = await ChatClient.checkStatus()
expect(response3.statusCode).to.equal(200)
expect(body).to.equal('chat is alive')
expect(await getCount()).to.equal(previousCount + 2)
})
it('should contain the messages and populated users when getting the messages', async function () {
const { response, body: messages } =
await ChatClient.getGlobalMessages(projectId)
expect(response.statusCode).to.equal(200)
expect(messages.length).to.equal(2)
messages.reverse()
expect(messages[0].content).to.equal(content1)
expect(messages[0].user_id).to.equal(userId1)
expect(messages[1].content).to.equal(content2)
expect(messages[1].user_id).to.equal(userId2)
})
})
describe('from all the threads', async function () {
const projectId = new ObjectId().toString()
const threadId1 = new ObjectId().toString()
const threadId2 = new ObjectId().toString()
before(async function () {
const { response } = await ChatClient.sendMessage(
projectId,
threadId1,
userId1,
'one'
)
expect(response.statusCode).to.equal(201)
const { response: response2 } = await ChatClient.sendMessage(
projectId,
threadId2,
userId2,
'two'
)
expect(response2.statusCode).to.equal(201)
const { response: response3 } = await ChatClient.sendMessage(
projectId,
threadId1,
userId1,
'three'
)
expect(response3.statusCode).to.equal(201)
const { response: response4 } = await ChatClient.sendMessage(
projectId,
threadId2,
userId2,
'four'
)
expect(response4.statusCode).to.equal(201)
})
it('should contain a dictionary of threads with messages with populated users', async function () {
const { response, body: threads } = await ChatClient.getThreads(projectId)
expect(response.statusCode).to.equal(200)
expect(Object.keys(threads).length).to.equal(2)
const thread1 = threads[threadId1]
expect(thread1.messages.length).to.equal(2)
const thread2 = threads[threadId2]
expect(thread2.messages.length).to.equal(2)
expect(thread1.messages[0].content).to.equal('one')
expect(thread1.messages[0].user_id).to.equal(userId1)
expect(thread1.messages[1].content).to.equal('three')
expect(thread1.messages[1].user_id).to.equal(userId1)
expect(thread2.messages[0].content).to.equal('two')
expect(thread2.messages[0].user_id).to.equal(userId2)
expect(thread2.messages[1].content).to.equal('four')
expect(thread2.messages[1].user_id).to.equal(userId2)
})
})
describe('from a list of threads', function () {
const projectId = new ObjectId().toString()
const threadId1 = new ObjectId().toString()
const threadId2 = new ObjectId().toString()
const threadId3 = new ObjectId().toString()
before(async function () {
const { response } = await ChatClient.sendMessage(
projectId,
threadId1,
userId1,
'one'
)
expect(response.statusCode).to.equal(201)
const { response: response2 } = await ChatClient.sendMessage(
projectId,
threadId2,
userId2,
'two'
)
expect(response2.statusCode).to.equal(201)
const { response: response3 } = await ChatClient.sendMessage(
projectId,
threadId1,
userId1,
'three'
)
expect(response3.statusCode).to.equal(201)
})
it('should contain a dictionary of threads with messages with populated users', async function () {
const { response, body: threads } = await ChatClient.generateThreadData(
projectId,
[threadId1, threadId3]
)
expect(response.statusCode).to.equal(200)
expect(Object.keys(threads).length).to.equal(1)
const thread1 = threads[threadId1]
expect(thread1.messages.length).to.equal(2)
expect(thread1.messages[0].content).to.equal('one')
expect(thread1.messages[0].user_id).to.equal(userId1)
expect(thread1.messages[1].content).to.equal('three')
expect(thread1.messages[1].user_id).to.equal(userId1)
})
})
})

View File

@@ -0,0 +1,114 @@
import { ObjectId } from '../../../app/js/mongodb.js'
import { expect } from 'chai'
import * as ChatClient from './helpers/ChatClient.js'
import * as ChatApp from './helpers/ChatApp.js'
describe('Resolving a thread', async function () {
const projectId = new ObjectId().toString()
const userId = new ObjectId().toString()
before(async function () {
await ChatApp.ensureRunning()
})
describe('with a resolved thread', async function () {
const threadId = new ObjectId().toString()
const content = 'resolved message'
before(async function () {
const { response } = await ChatClient.sendMessage(
projectId,
threadId,
userId,
content
)
expect(response.statusCode).to.equal(201)
const { response: response2 } = await ChatClient.resolveThread(
projectId,
threadId,
userId
)
expect(response2.statusCode).to.equal(204)
})
it('should then list the thread as resolved', async function () {
const { response, body: threads } = await ChatClient.getThreads(projectId)
expect(response.statusCode).to.equal(200)
expect(threads[threadId].resolved).to.equal(true)
expect(threads[threadId].resolved_by_user_id).to.equal(userId)
const resolvedAt = new Date(threads[threadId].resolved_at)
expect(new Date() - resolvedAt).to.be.below(1000)
})
it('should list the thread id in the resolved thread ids endpoint', async function () {
const { response, body } =
await ChatClient.getResolvedThreadIds(projectId)
expect(response.statusCode).to.equal(200)
expect(body.resolvedThreadIds).to.include(threadId)
})
})
describe('when a thread is not resolved', async function () {
const threadId = new ObjectId().toString()
const content = 'open message'
before(async function () {
const { response } = await ChatClient.sendMessage(
projectId,
threadId,
userId,
content
)
expect(response.statusCode).to.equal(201)
})
it('should not list the thread as resolved', async function () {
const { response, body: threads } = await ChatClient.getThreads(projectId)
expect(response.statusCode).to.equal(200)
expect(threads[threadId].resolved).to.be.undefined
})
it('should not list the thread in the resolved thread ids endpoint', async function () {
const { response, body } =
await ChatClient.getResolvedThreadIds(projectId)
expect(response.statusCode).to.equal(200)
expect(body.resolvedThreadIds).not.to.include(threadId)
})
})
describe('when a thread is resolved then reopened', async function () {
const threadId = new ObjectId().toString()
const content = 'resolved message'
before(async function () {
const { response } = await ChatClient.sendMessage(
projectId,
threadId,
userId,
content
)
expect(response.statusCode).to.equal(201)
const { response: response2 } = await ChatClient.resolveThread(
projectId,
threadId,
userId
)
expect(response2.statusCode).to.equal(204)
const { response: response3 } = await ChatClient.reopenThread(
projectId,
threadId
)
expect(response3.statusCode).to.equal(204)
})
it('should not list the thread as resolved', async function () {
const { response, body: threads } = await ChatClient.getThreads(projectId)
expect(response.statusCode).to.equal(200)
expect(threads[threadId].resolved).to.be.undefined
})
it('should not list the thread in the resolved thread ids endpoint', async function () {
const { response, body } =
await ChatClient.getResolvedThreadIds(projectId)
expect(response.statusCode).to.equal(200)
expect(body.resolvedThreadIds).not.to.include(threadId)
})
})
})

View File

@@ -0,0 +1,143 @@
import { ObjectId } from '../../../app/js/mongodb.js'
import { expect } from 'chai'
import * as ChatClient from './helpers/ChatClient.js'
import * as ChatApp from './helpers/ChatApp.js'
describe('Sending a message', async function () {
before(async function () {
await ChatApp.ensureRunning()
})
describe('globally', async function () {
const projectId = new ObjectId().toString()
const userId = new ObjectId().toString()
const content = 'global message'
before(async function () {
const { response, body } = await ChatClient.sendGlobalMessage(
projectId,
userId,
content
)
expect(response.statusCode).to.equal(201)
expect(body.content).to.equal(content)
expect(body.user_id).to.equal(userId)
expect(body.room_id).to.equal(projectId)
})
it('should then list the message in the project messages', async function () {
const { response, body: messages } =
await ChatClient.getGlobalMessages(projectId)
expect(response.statusCode).to.equal(200)
expect(messages.length).to.equal(1)
expect(messages[0].content).to.equal(content)
})
})
describe('to a thread', async function () {
const projectId = new ObjectId().toString()
const userId = new ObjectId().toString()
const threadId = new ObjectId().toString()
const content = 'thread message'
before(async function () {
const { response, body } = await ChatClient.sendMessage(
projectId,
threadId,
userId,
content
)
expect(response.statusCode).to.equal(201)
expect(body.content).to.equal(content)
expect(body.user_id).to.equal(userId)
expect(body.room_id).to.equal(projectId)
})
it('should then list the message in the threads', async function () {
const { response, body: threads } = await ChatClient.getThreads(projectId)
expect(response.statusCode).to.equal(200)
expect(threads[threadId].messages.length).to.equal(1)
expect(threads[threadId].messages[0].content).to.equal(content)
})
it('should not appear in the global messages', async function () {
const { response, body: messages } =
await ChatClient.getGlobalMessages(projectId)
expect(response.statusCode).to.equal(200)
expect(messages.length).to.equal(0)
})
})
describe('failure cases', async function () {
const projectId = new ObjectId().toString()
const userId = new ObjectId().toString()
const threadId = new ObjectId().toString()
describe('with a malformed userId', async function () {
it('should return a graceful error', async function () {
const { response, body } = await ChatClient.sendMessage(
projectId,
threadId,
'malformed-user',
'content'
)
expect(response.statusCode).to.equal(400)
expect(body).to.equal('Invalid userId')
})
})
describe('with a malformed projectId', async function () {
it('should return a graceful error', async function () {
const { response, body } = await ChatClient.sendMessage(
'malformed-project',
threadId,
userId,
'content'
)
expect(response.statusCode).to.equal(400)
expect(body).to.equal('Invalid projectId')
})
})
describe('with a malformed threadId', async function () {
it('should return a graceful error', async function () {
const { response, body } = await ChatClient.sendMessage(
projectId,
'malformed-thread-id',
userId,
'content'
)
expect(response.statusCode).to.equal(400)
expect(body).to.equal('Invalid threadId')
})
})
describe('with no content', async function () {
it('should return a graceful error', async function () {
const { response, body } = await ChatClient.sendMessage(
projectId,
threadId,
userId,
null
)
expect(response.statusCode).to.equal(400)
        // Exegesis is responding with validation errors. I can't find a way to choose the validation error yet.
// expect(body).to.equal('No content provided')
expect(body.message).to.equal('Validation errors')
})
})
describe('with very long content', async function () {
it('should return a graceful error', async function () {
const content = '-'.repeat(10 * 1024 + 1)
const { response, body } = await ChatClient.sendMessage(
projectId,
threadId,
userId,
content
)
expect(response.statusCode).to.equal(400)
expect(body).to.equal('Content too long (> 10240 bytes)')
})
})
})
})

View File

@@ -0,0 +1,15 @@
import { createServer } from '../../../../app/js/server.js'
import { promisify } from 'node:util'
export { db } from '../../../../app/js/mongodb.js'
let serverPromise = null
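// Cache the listen() promise so every acceptance test file shares one server
// instance bound to 127.0.0.1:3010.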
export async function ensureRunning() {
if (!serverPromise) {
const { app } = await createServer()
const startServer = promisify(app.listen.bind(app))
serverPromise = startServer(3010, '127.0.0.1')
}
return serverPromise
}

View File

@@ -0,0 +1,166 @@
import Request from 'request'
const request = Request.defaults({
baseUrl: 'http://127.0.0.1:3010',
})
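// Wrap the callback-based `request` library in a promise resolving to
// { response, body } so the helpers below can simply be awaited.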
async function asyncRequest(options) {
return await new Promise((resolve, reject) => {
request(options, (err, response, body) => {
if (err) {
reject(err)
} else {
resolve({ response, body })
}
})
})
}
export async function sendGlobalMessage(projectId, userId, content) {
return await asyncRequest({
method: 'post',
url: `/project/${projectId}/messages`,
json: {
user_id: userId,
content,
},
})
}
export async function getGlobalMessages(projectId) {
return await asyncRequest({
method: 'get',
url: `/project/${projectId}/messages`,
json: true,
})
}
export async function sendMessage(projectId, threadId, userId, content) {
return await asyncRequest({
method: 'post',
url: `/project/${projectId}/thread/${threadId}/messages`,
json: {
user_id: userId,
content,
},
})
}
export async function getThreads(projectId) {
return await asyncRequest({
method: 'get',
url: `/project/${projectId}/threads`,
json: true,
})
}
export async function resolveThread(projectId, threadId, userId) {
return await asyncRequest({
method: 'post',
url: `/project/${projectId}/thread/${threadId}/resolve`,
json: {
user_id: userId,
},
})
}
export async function getResolvedThreadIds(projectId) {
return await asyncRequest({
method: 'get',
url: `/project/${projectId}/resolved-thread-ids`,
json: true,
})
}
export async function editMessage(projectId, threadId, messageId, content) {
return await asyncRequest({
method: 'post',
url: `/project/${projectId}/thread/${threadId}/messages/${messageId}/edit`,
json: {
content,
},
})
}
export async function editMessageWithUser(
projectId,
threadId,
messageId,
userId,
content
) {
return await asyncRequest({
method: 'post',
url: `/project/${projectId}/thread/${threadId}/messages/${messageId}/edit`,
json: {
content,
userId,
},
})
}
export async function checkStatus() {
return await asyncRequest({
method: 'get',
url: `/status`,
json: true,
})
}
export async function getMetric(matcher) {
  const { body } = await asyncRequest({
    method: 'get',
    url: `/metrics`,
  })
  // Find the first Prometheus metric line accepted by `matcher` and return its
  // value, defaulting to 0 when the metric has not been recorded yet.
  const found = body.split('\n').find(matcher)
  if (!found) return 0
  return parseInt(found.split(' ')[1], 10)
}
export async function reopenThread(projectId, threadId) {
return await asyncRequest({
method: 'post',
url: `/project/${projectId}/thread/${threadId}/reopen`,
})
}
export async function deleteThread(projectId, threadId) {
return await asyncRequest({
method: 'delete',
url: `/project/${projectId}/thread/${threadId}`,
})
}
export async function deleteMessage(projectId, threadId, messageId) {
return await asyncRequest({
method: 'delete',
url: `/project/${projectId}/thread/${threadId}/messages/${messageId}`,
})
}
export async function destroyProject(projectId) {
return await asyncRequest({
method: 'delete',
url: `/project/${projectId}`,
})
}
export async function duplicateCommentThreads(projectId, threads) {
return await asyncRequest({
method: 'post',
url: `/project/${projectId}/duplicate-comment-threads`,
json: {
threads,
},
})
}
export async function generateThreadData(projectId, threads) {
return await asyncRequest({
method: 'post',
url: `/project/${projectId}/generate-thread-data`,
json: {
threads,
},
})
}

View File

@@ -0,0 +1,9 @@
import chai from 'chai'
import chaiAsPromised from 'chai-as-promised'
import { ObjectId } from 'mongodb'
// ensure every ObjectId has the id string as a property for correct comparisons
ObjectId.cacheHexString = true
chai.should()
chai.use(chaiAsPromised)

View File

@@ -0,0 +1,12 @@
{
"extends": "../../tsconfig.backend.json",
"include": [
"app.js",
"app/js/**/*",
"benchmarks/**/*",
"config/**/*",
"scripts/**/*",
"test/**/*",
"types"
]
}

14
services/clsi/.gitignore vendored Normal file
View File

@@ -0,0 +1,14 @@
**.swp
node_modules
test/acceptance/fixtures/tmp
compiles
output
.DS_Store
*~
cache
.vagrant
config/*
npm-debug.log
# managed by dev-environment$ bin/update_build_scripts
.npmrc

View File

@@ -0,0 +1,3 @@
{
"require": "test/setup.js"
}

1
services/clsi/.nvmrc Normal file
View File

@@ -0,0 +1 @@
20.18.2

35
services/clsi/.viminfo Normal file
View File

@@ -0,0 +1,35 @@
# This viminfo file was generated by Vim 7.4.
# You may edit it if you're careful!
# Value of 'encoding' when this file was written
*encoding=latin1
# hlsearch on (H) or off (h):
~h
# Command Line History (newest to oldest):
:x
# Search String History (newest to oldest):
# Expression History (newest to oldest):
# Input Line History (newest to oldest):
# Input Line History (newest to oldest):
# Registers:
# File marks:
'0 1 0 ~/hello
# Jumplist (newest first):
-' 1 0 ~/hello
# History of marks within files (newest to oldest):
> ~/hello
" 1 0
^ 1 1
. 1 0
+ 1 0

32
services/clsi/Dockerfile Normal file
View File

@@ -0,0 +1,32 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
FROM node:20.18.2 AS base
WORKDIR /overleaf/services/clsi
COPY services/clsi/install_deps.sh /overleaf/services/clsi/
RUN chmod 0755 ./install_deps.sh && ./install_deps.sh
ENTRYPOINT ["/bin/sh", "/entrypoint.sh"]
COPY services/clsi/entrypoint.sh /
# Google Cloud Storage needs a writable $HOME/.config for resumable uploads
# (see https://googleapis.dev/nodejs/storage/latest/File.html#createWriteStream)
RUN mkdir /home/node/.config && chown node:node /home/node/.config
FROM base AS app
COPY package.json package-lock.json /overleaf/
COPY services/clsi/package.json /overleaf/services/clsi/
COPY libraries/ /overleaf/libraries/
COPY patches/ /overleaf/patches/
RUN cd /overleaf && npm ci --quiet
COPY services/clsi/ /overleaf/services/clsi/
FROM app
RUN mkdir -p cache compiles output \
&& chown node:node cache compiles output
CMD ["node", "--expose-gc", "app.js"]

661
services/clsi/LICENSE Normal file
View File

@@ -0,0 +1,661 @@
GNU AFFERO GENERAL PUBLIC LICENSE
Version 3, 19 November 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
our General Public Licenses are intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.
A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate. Many developers of free software are heartened and
encouraged by the resulting cooperation. However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.
The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community. It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server. Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.
An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals. This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU Affero General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Remote Network Interaction; Use with the GNU General Public License.
Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software. This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU Affero General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source. For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code. There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<http://www.gnu.org/licenses/>.
158
services/clsi/Makefile Normal file
View File
@@ -0,0 +1,158 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
BUILD_NUMBER ?= local
BRANCH_NAME ?= $(shell git rev-parse --abbrev-ref HEAD)
PROJECT_NAME = clsi
BUILD_DIR_NAME = $(shell pwd | xargs basename | tr -cd '[a-zA-Z0-9_.\-]')
DOCKER_COMPOSE_FLAGS ?= -f docker-compose.yml
DOCKER_COMPOSE := BUILD_NUMBER=$(BUILD_NUMBER) \
BRANCH_NAME=$(BRANCH_NAME) \
PROJECT_NAME=$(PROJECT_NAME) \
MOCHA_GREP=${MOCHA_GREP} \
docker compose ${DOCKER_COMPOSE_FLAGS}
COMPOSE_PROJECT_NAME_TEST_ACCEPTANCE ?= test_acceptance_$(BUILD_DIR_NAME)
DOCKER_COMPOSE_TEST_ACCEPTANCE = \
COMPOSE_PROJECT_NAME=$(COMPOSE_PROJECT_NAME_TEST_ACCEPTANCE) $(DOCKER_COMPOSE)
COMPOSE_PROJECT_NAME_TEST_UNIT ?= test_unit_$(BUILD_DIR_NAME)
DOCKER_COMPOSE_TEST_UNIT = \
COMPOSE_PROJECT_NAME=$(COMPOSE_PROJECT_NAME_TEST_UNIT) $(DOCKER_COMPOSE)
clean:
-docker rmi ci/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER)
-docker rmi gcr.io/overleaf-ops/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER)
-docker rmi us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER)
-$(DOCKER_COMPOSE_TEST_UNIT) down --rmi local
-$(DOCKER_COMPOSE_TEST_ACCEPTANCE) down --rmi local
HERE=$(shell pwd)
MONOREPO=$(shell cd ../../ && pwd)
# Run the linting commands in the scope of the monorepo.
# Eslint and prettier (plus some configs) are on the root.
RUN_LINTING = docker run --rm -v $(MONOREPO):$(MONOREPO) -w $(HERE) node:20.18.2 npm run --silent
RUN_LINTING_CI = docker run --rm --volume $(MONOREPO)/.editorconfig:/overleaf/.editorconfig --volume $(MONOREPO)/.eslintignore:/overleaf/.eslintignore --volume $(MONOREPO)/.eslintrc:/overleaf/.eslintrc --volume $(MONOREPO)/.prettierignore:/overleaf/.prettierignore --volume $(MONOREPO)/.prettierrc:/overleaf/.prettierrc --volume $(MONOREPO)/tsconfig.backend.json:/overleaf/tsconfig.backend.json ci/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER) npm run --silent
# Same but from the top of the monorepo
RUN_LINTING_MONOREPO = docker run --rm -v $(MONOREPO):$(MONOREPO) -w $(MONOREPO) node:20.18.2 npm run --silent
SHELLCHECK_OPTS = \
--shell=bash \
--external-sources
SHELLCHECK_COLOR := $(if $(CI),--color=never,--color)
SHELLCHECK_FILES := { git ls-files "*.sh" -z; git grep -Plz "\A\#\!.*bash"; } | sort -zu
shellcheck:
@$(SHELLCHECK_FILES) | xargs -0 -r docker run --rm -v $(HERE):/mnt -w /mnt \
koalaman/shellcheck:stable $(SHELLCHECK_OPTS) $(SHELLCHECK_COLOR)
shellcheck_fix:
@$(SHELLCHECK_FILES) | while IFS= read -r -d '' file; do \
diff=$$(docker run --rm -v $(HERE):/mnt -w /mnt koalaman/shellcheck:stable $(SHELLCHECK_OPTS) --format=diff "$$file" 2>/dev/null); \
if [ -n "$$diff" ] && ! echo "$$diff" | patch -p1 >/dev/null 2>&1; then echo "\033[31m$$file\033[0m"; \
elif [ -n "$$diff" ]; then echo "$$file"; \
else echo "\033[2m$$file\033[0m"; fi \
done
format:
$(RUN_LINTING) format
format_ci:
$(RUN_LINTING_CI) format
format_fix:
$(RUN_LINTING) format:fix
lint:
$(RUN_LINTING) lint
lint_ci:
$(RUN_LINTING_CI) lint
lint_fix:
$(RUN_LINTING) lint:fix
typecheck:
$(RUN_LINTING) types:check
typecheck_ci:
$(RUN_LINTING_CI) types:check
test: format lint typecheck shellcheck test_unit test_acceptance
test_unit:
ifneq (,$(wildcard test/unit))
$(DOCKER_COMPOSE_TEST_UNIT) run --rm test_unit
$(MAKE) test_unit_clean
endif
test_clean: test_unit_clean
test_unit_clean:
ifneq (,$(wildcard test/unit))
$(DOCKER_COMPOSE_TEST_UNIT) down -v -t 0
endif
test_acceptance: test_acceptance_clean test_acceptance_pre_run test_acceptance_run
$(MAKE) test_acceptance_clean
test_acceptance_debug: test_acceptance_clean test_acceptance_pre_run test_acceptance_run_debug
$(MAKE) test_acceptance_clean
test_acceptance_run:
ifneq (,$(wildcard test/acceptance))
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run --rm test_acceptance
endif
test_acceptance_run_debug:
ifneq (,$(wildcard test/acceptance))
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run -p 127.0.0.9:19999:19999 --rm test_acceptance npm run test:acceptance -- --inspect=0.0.0.0:19999 --inspect-brk
endif
test_clean: test_acceptance_clean
test_acceptance_clean:
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) down -v -t 0
test_acceptance_pre_run:
ifneq (,$(wildcard test/acceptance/js/scripts/pre-run))
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run --rm test_acceptance test/acceptance/js/scripts/pre-run
endif
benchmarks:
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run --rm test_acceptance npm run benchmarks
build:
docker build \
--pull \
--build-arg BUILDKIT_INLINE_CACHE=1 \
--tag ci/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER) \
--tag gcr.io/overleaf-ops/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER) \
--tag gcr.io/overleaf-ops/$(PROJECT_NAME):$(BRANCH_NAME) \
--cache-from gcr.io/overleaf-ops/$(PROJECT_NAME):$(BRANCH_NAME) \
--cache-from gcr.io/overleaf-ops/$(PROJECT_NAME):main \
--tag us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER) \
--file Dockerfile \
../..
tar:
$(DOCKER_COMPOSE) up tar
publish:
docker push $(DOCKER_REPO)/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER)
.PHONY: clean \
format format_fix \
lint lint_fix \
build_types typecheck \
lint_ci format_ci typecheck_ci \
shellcheck shellcheck_fix \
test test_clean test_unit test_unit_clean \
test_acceptance test_acceptance_debug test_acceptance_pre_run \
test_acceptance_run test_acceptance_run_debug test_acceptance_clean \
benchmarks \
build tar publish \
188
services/clsi/README.md Normal file
View File
@@ -0,0 +1,188 @@
overleaf/clsi
===============
A web api for compiling LaTeX documents in the cloud
The Common LaTeX Service Interface (CLSI) provides a RESTful interface to traditional LaTeX tools (or, more generally, any command line tool for composing marked-up documents into a display format such as PDF or HTML). The CLSI listens on the following ports by default:
* TCP/3013 - the RESTful interface
* TCP/3048 - reports load information
* TCP/3049 - HTTP interface to control the CLSI service
These defaults can be modified in `config/settings.defaults.js`.
The provided `Dockerfile` builds a Docker image which has the Docker command line tools installed. The configuration in `docker-compose-config.yml` mounts the Docker socket, in order that the CLSI container can talk to the Docker host it is running in. This allows it to spin up `sibling containers` running an image with a TeX distribution installed to perform the actual compiles.
The CLSI can be configured through the following environment variables:
* `ALLOWED_COMPILE_GROUPS` - Space separated list of allowed compile groups
* `ALLOWED_IMAGES` - Space separated list of allowed Docker TeX Live images
* `CATCH_ERRORS` - Set to `true` to log uncaught exceptions
* `COMPILE_GROUP_DOCKER_CONFIGS` - JSON string of Docker configs for compile groups
* `COMPILES_HOST_DIR` - Working directory for LaTeX compiles
* `OUTPUT_HOST_DIR` - Output directory for LaTeX compiles
* `COMPILE_SIZE_LIMIT` - Sets the body-parser [limit](https://github.com/expressjs/body-parser#limit)
* `DOCKER_RUNNER` - Set to true to use sibling containers
* `DOCKER_RUNTIME` -
* `FILESTORE_DOMAIN_OVERRIDE` - The URL for the filestore service, e.g. `http://$FILESTORE_HOST:3009`
* `FILESTORE_PARALLEL_FILE_DOWNLOADS` - Number of parallel file downloads
* `LISTEN_ADDRESS` - The address for the RESTful service to listen on. Set to `0.0.0.0` to listen on all network interfaces
* `PROCESS_LIFE_SPAN_LIMIT_MS` - Process life span limit in milliseconds
* `SMOKE_TEST` - Whether to run smoke tests
* `TEXLIVE_IMAGE` - The TeX Live Docker image to use for sibling containers, e.g. `gcr.io/overleaf-ops/texlive-full:2017.1`
* `TEX_LIVE_IMAGE_NAME_OVERRIDE` - The name of the registry for the Docker image e.g. `gcr.io/overleaf-ops`
* `TEXLIVE_IMAGE_USER` - When using sibling containers, the user to run as in the TeX Live image. Defaults to `tex`
* `TEXLIVE_OPENOUT_ANY` - Sets the `openout_any` environment variable for TeX Live (see the `\openout` primitive [documentation](http://tug.org/texinfohtml/web2c.html#tex-invocation))
Further environment variables configure the [metrics module](https://github.com/overleaf/metrics-module)
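For example, a minimal sibling-container setup might export a handful of these variables before starting the service. This is an illustrative sketch rather than a list of defaults; the values simply mirror the `docker run` example further down:
```shell
# Illustrative values only; see the docker run example below for the full set
export LISTEN_ADDRESS=0.0.0.0          # listen on all network interfaces
export DOCKER_RUNNER=true              # compile in sibling containers
export TEXLIVE_IMAGE=texlive/texlive   # TeX Live image used for compiles
export TEXLIVE_IMAGE_USER=root         # user to run as inside that image
```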
Installation
------------
The CLSI can be installed and set up as part of the entire [Overleaf stack](https://github.com/overleaf/overleaf) (complete with front end editor and document storage), or it can be run as a standalone service. To run it as a standalone service, first check out this repository:
```shell
git clone git@github.com:overleaf/overleaf.git
```
Then build the Docker image:
```shell
docker build . -t overleaf/clsi -f services/clsi/Dockerfile
```
Then pull the TeX Live image:
```shell
docker pull texlive/texlive
```
Then start the Docker container:
```shell
docker run --rm \
-p 127.0.0.1:3013:3013 \
-e LISTEN_ADDRESS=0.0.0.0 \
-e DOCKER_RUNNER=true \
-e TEXLIVE_IMAGE=texlive/texlive \
-e TEXLIVE_IMAGE_USER=root \
-e COMPILES_HOST_DIR="$PWD/compiles" \
-v "$PWD/compiles:/overleaf/services/clsi/compiles" \
-v "$PWD/cache:/overleaf/services/clsi/cache" \
-v /var/run/docker.sock:/var/run/docker.sock \
--name clsi \
overleaf/clsi
```
Note: if you're running the CLSI on macOS you may need to use `-v /var/run/docker.sock.raw:/var/run/docker.sock` instead.
The CLSI should then be running at <http://localhost:3013>
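Once the container is up, a quick liveness check can be made against the status endpoint exposed by `app.js` (a sketch assuming the default port mapping above):
```shell
curl http://localhost:3013/status
# should print: CLSI is alive
```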
Important note for Linux users
==============================
The Node application runs as user `node` in the CLSI, which has uid `1000`. As a consequence of this, the `compiles` folder gets created on your host with `uid` and `gid` set to `1000`.
```shell
ls -lnd compiles
```
> `drwxr-xr-x 2 1000 1000 4096 Mar 19 12:41 compiles`
If there is a user/group on your host which also happens to have `uid` / `gid` `1000` then that user/group will have ownership of the compiles folder on your host.
LaTeX runs in the sibling containers as the user specified in the `TEXLIVE_IMAGE_USER` environment variable. In the example above this is set to `root`, which has uid `0`. This creates a problem with the above permissions, as the root user does not have permission to write to subfolders of `compiles`.
A quick fix is to give the `root` group ownership and read write permissions to `compiles`, with `setgid` set so that new subfolders also inherit this ownership:
```shell
sudo chown -R 1000:root compiles
sudo chmod -R g+w compiles
sudo chmod g+s compiles
```
Another solution is to create an `overleaf` group and add both `root` and the user with `uid` `1000` to it. If the host does not have a user with that `uid`, you will need to create one first.
```shell
sudo useradd --uid 1000 host-node-user # If required
sudo groupadd overleaf
sudo usermod -a -G overleaf root
sudo usermod -a -G overleaf $(id -nu 1000)
sudo chown -R 1000:overleaf compiles
sudo chmod -R g+w compiles
sudo chmod g+s compiles
```
This is a facet of the way docker works on Linux. See this [upstream issue](https://github.com/moby/moby/issues/7198)
API
---
The CLSI is based on a JSON API.
#### Example Request
(Note that the example below contains comments for illustration only; valid JSON must not contain comments.)
POST /project/<project-id>/compile
```json5
{
"compile": {
"options": {
// Which compiler to use. Can be latex, pdflatex, xelatex or lualatex
"compiler": "lualatex",
// How many seconds to wait before killing the process. Default is 60.
"timeout": 40
},
// The main file to run LaTeX on
"rootResourcePath": "main.tex",
// An array of files to include in the compilation. May have either the content
// passed directly, or a URL where it can be downloaded.
"resources": [
{
"path": "main.tex",
"content": "\\documentclass{article}\n\\begin{document}\nHello World\n\\end{document}"
}
// ,{
// "path": "image.png",
// "url": "www.example.com/image.png",
// "modified": 123456789 // Unix time since epoch
// }
]
}
}
```
With `curl`, if you place the above JSON in a file called `data.json`, the request would look like this:
```shell
curl -X POST -H 'Content-Type: application/json' -d @data.json http://localhost:3013/project/<id>/compile
```
You can specify any project-id in the URL, and the files and LaTeX environment will be persisted between requests.
URLs will be downloaded and cached until provided with a more recent modified date.
#### Example Response
```json
{
"compile": {
"status": "success",
"outputFiles": [{
"type": "pdf",
"url": "http://localhost:3013/project/<project-id>/output/output.pdf"
}, {
"type": "log",
"url": "http://localhost:3013/project/<project-id>/output/output.log"
}]
}
}
```
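The URLs in `outputFiles` can be fetched directly. As a sketch, assuming the compile above succeeded and the default port, the generated PDF could be downloaded with:
```shell
curl -o output.pdf "http://localhost:3013/project/<project-id>/output/output.pdf"
```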
License
-------
The code in this repository is released under the GNU AFFERO GENERAL PUBLIC LICENSE, version 3. A copy can be found in the `LICENSE` file.
Copyright (c) Overleaf, 2014-2021.
386
services/clsi/app.js Normal file
View File
@@ -0,0 +1,386 @@
// Metrics must be initialized before importing anything else
require('@overleaf/metrics/initialize')
const CompileController = require('./app/js/CompileController')
const ContentController = require('./app/js/ContentController')
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
logger.initialize('clsi')
const Metrics = require('@overleaf/metrics')
const smokeTest = require('./test/smoke/js/SmokeTests')
const ContentTypeMapper = require('./app/js/ContentTypeMapper')
const Errors = require('./app/js/Errors')
const { createOutputZip } = require('./app/js/OutputController')
const Path = require('node:path')
Metrics.open_sockets.monitor(true)
Metrics.memory.monitor(logger)
Metrics.leaked_sockets.monitor(logger)
const ProjectPersistenceManager = require('./app/js/ProjectPersistenceManager')
const OutputCacheManager = require('./app/js/OutputCacheManager')
const ContentCacheManager = require('./app/js/ContentCacheManager')
ProjectPersistenceManager.init()
OutputCacheManager.init()
const express = require('express')
const bodyParser = require('body-parser')
const app = express()
Metrics.injectMetricsRoute(app)
app.use(Metrics.http.monitor(logger))
// Compile requests can take longer than the default two
// minutes (including file download time), so bump up the
// timeout a bit.
const TIMEOUT = 10 * 60 * 1000
app.use(function (req, res, next) {
req.setTimeout(TIMEOUT)
res.setTimeout(TIMEOUT)
res.removeHeader('X-Powered-By')
next()
})
app.param('project_id', function (req, res, next, projectId) {
if (projectId?.match(/^[a-zA-Z0-9_-]+$/)) {
next()
} else {
next(new Error('invalid project id'))
}
})
app.param('user_id', function (req, res, next, userId) {
if (userId?.match(/^[0-9a-f]{24}$/)) {
next()
} else {
next(new Error('invalid user id'))
}
})
app.param('build_id', function (req, res, next, buildId) {
if (buildId?.match(OutputCacheManager.BUILD_REGEX)) {
next()
} else {
next(new Error(`invalid build id ${buildId}`))
}
})
app.param('contentId', function (req, res, next, contentId) {
if (contentId?.match(OutputCacheManager.CONTENT_REGEX)) {
next()
} else {
next(new Error(`invalid content id ${contentId}`))
}
})
app.param('hash', function (req, res, next, hash) {
if (hash?.match(ContentCacheManager.HASH_REGEX)) {
next()
} else {
next(new Error(`invalid hash ${hash}`))
}
})
app.post(
'/project/:project_id/compile',
bodyParser.json({ limit: Settings.compileSizeLimit }),
CompileController.compile
)
app.post('/project/:project_id/compile/stop', CompileController.stopCompile)
app.delete('/project/:project_id', CompileController.clearCache)
app.get('/project/:project_id/sync/code', CompileController.syncFromCode)
app.get('/project/:project_id/sync/pdf', CompileController.syncFromPdf)
app.get('/project/:project_id/wordcount', CompileController.wordcount)
app.get('/project/:project_id/status', CompileController.status)
app.post('/project/:project_id/status', CompileController.status)
// Per-user containers
app.post(
'/project/:project_id/user/:user_id/compile',
bodyParser.json({ limit: Settings.compileSizeLimit }),
CompileController.compile
)
app.post(
'/project/:project_id/user/:user_id/compile/stop',
CompileController.stopCompile
)
app.delete('/project/:project_id/user/:user_id', CompileController.clearCache)
app.get(
'/project/:project_id/user/:user_id/sync/code',
CompileController.syncFromCode
)
app.get(
'/project/:project_id/user/:user_id/sync/pdf',
CompileController.syncFromPdf
)
app.get(
'/project/:project_id/user/:user_id/wordcount',
CompileController.wordcount
)
const ForbidSymlinks = require('./app/js/StaticServerForbidSymlinks')
// create a static server which does not allow access to any symlinks
// avoids possible mismatch of root directory between middleware check
// and serving the files
const staticOutputServer = ForbidSymlinks(
express.static,
Settings.path.outputDir,
{
setHeaders(res, path, stat) {
if (Path.basename(path) === 'output.pdf') {
// Calculate an etag in the same way as nginx
// https://github.com/tj/send/issues/65
const etag = (path, stat) =>
`"${Math.ceil(+stat.mtime / 1000).toString(16)}` +
'-' +
Number(stat.size).toString(16) +
'"'
res.set('Etag', etag(path, stat))
}
res.set('Content-Type', ContentTypeMapper.map(path))
},
}
)
// This needs to be before GET /project/:project_id/build/:build_id/output/*
app.get(
'/project/:project_id/build/:build_id/output/output.zip',
bodyParser.json(),
createOutputZip
)
// This needs to be before GET /project/:project_id/user/:user_id/build/:build_id/output/*
app.get(
'/project/:project_id/user/:user_id/build/:build_id/output/output.zip',
bodyParser.json(),
createOutputZip
)
app.get(
'/project/:project_id/user/:user_id/build/:build_id/output/*',
function (req, res, next) {
// for specific build get the path from the OutputCacheManager (e.g. .clsi/buildId)
req.url =
`/${req.params.project_id}-${req.params.user_id}/` +
OutputCacheManager.path(req.params.build_id, `/${req.params[0]}`)
staticOutputServer(req, res, next)
}
)
app.get(
'/project/:projectId/content/:contentId/:hash',
ContentController.getPdfRange
)
app.get(
'/project/:projectId/user/:userId/content/:contentId/:hash',
ContentController.getPdfRange
)
app.get(
'/project/:project_id/build/:build_id/output/*',
function (req, res, next) {
// for specific build get the path from the OutputCacheManager (e.g. .clsi/buildId)
req.url =
`/${req.params.project_id}/` +
OutputCacheManager.path(req.params.build_id, `/${req.params[0]}`)
staticOutputServer(req, res, next)
}
)
app.get('/oops', function (req, res, next) {
logger.error({ err: 'hello' }, 'test error')
res.send('error\n')
})
app.get('/oops-internal', function (req, res, next) {
setTimeout(function () {
throw new Error('Test error')
}, 1)
})
app.get('/status', (req, res, next) => res.send('CLSI is alive\n'))
Settings.processTooOld = false
if (Settings.processLifespanLimitMs) {
// Pre-emptible instances have a maximum lifespan of 24h, after which they will
// be shut down with a 30s grace period.
// Spread cycling of VMs by up to 2.4h _before_ their limit to avoid large
// numbers of VMs that are temporarily unavailable (while they reboot).
Settings.processLifespanLimitMs -=
Settings.processLifespanLimitMs * (Math.random() / 10)
logger.info(
{ target: new Date(Date.now() + Settings.processLifespanLimitMs) },
'Lifespan limited'
)
setTimeout(() => {
logger.info({}, 'shutting down, process is too old')
Settings.processTooOld = true
}, Settings.processLifespanLimitMs)
}
function runSmokeTest() {
if (Settings.processTooOld) return
const INTERVAL = 30 * 1000
if (
smokeTest.lastRunSuccessful() &&
CompileController.timeSinceLastSuccessfulCompile() < INTERVAL / 2
) {
logger.debug('skipping smoke tests, got recent successful user compile')
return setTimeout(runSmokeTest, INTERVAL / 2)
}
logger.debug('running smoke tests')
smokeTest.triggerRun(err => {
if (err) logger.error({ err }, 'smoke tests failed')
setTimeout(runSmokeTest, INTERVAL)
})
}
if (Settings.smokeTest) {
runSmokeTest()
}
app.get('/health_check', function (req, res) {
if (Settings.processTooOld) {
return res.status(500).json({ processTooOld: true })
}
smokeTest.sendLastResult(res)
})
app.get('/smoke_test_force', (req, res) => smokeTest.sendNewResult(res))
app.use(function (error, req, res, next) {
if (error instanceof Errors.NotFoundError) {
logger.debug({ err: error, url: req.url }, 'not found error')
res.sendStatus(404)
} else if (error instanceof Errors.InvalidParameter) {
res.status(400).send(error.message)
} else if (error.code === 'EPIPE') {
// inspect container returns EPIPE when shutting down
res.sendStatus(503) // send 503 Unavailable response
} else {
logger.error({ err: error, url: req.url }, 'server error')
res.sendStatus(error.statusCode || 500)
}
})
const net = require('node:net')
const os = require('node:os')
let STATE = 'up'
const loadTcpServer = net.createServer(function (socket) {
socket.on('error', function (err) {
if (err.code === 'ECONNRESET') {
// this always comes up, we don't know why
return
}
logger.err({ err }, 'error with socket on load check')
socket.destroy()
})
if (STATE === 'up' && Settings.internal.load_balancer_agent.report_load) {
let availableWorkingCpus
const currentLoad = os.loadavg()[0]
// staging clsi instances have only 1 cpu core
if (os.cpus().length === 1) {
availableWorkingCpus = 1
} else {
availableWorkingCpus = os.cpus().length - 1
}
const freeLoad = availableWorkingCpus - currentLoad
const freeLoadPercentage = Math.round(
(freeLoad / availableWorkingCpus) * 100
)
if (
Settings.internal.load_balancer_agent.allow_maintenance &&
freeLoadPercentage <= 0
) {
// When it's 0 the server is set to drain implicitly.
// Drain will move new projects to different servers.
// Drain will keep existing projects assigned to the same server.
// Maint will move existing and new projects to different servers.
socket.write(`maint, 0%\n`, 'ASCII')
} else {
// Ready will cancel the maint state.
socket.write(`up, ready, ${Math.max(freeLoadPercentage, 1)}%\n`, 'ASCII')
if (freeLoadPercentage <= 0) {
// This metric records how often we would have gone into maintenance mode.
Metrics.inc('clsi-prevented-maint')
}
}
socket.end()
} else {
socket.write(`${STATE}\n`, 'ASCII')
socket.end()
}
})
const loadHttpServer = express()
loadHttpServer.post('/state/up', function (req, res, next) {
STATE = 'up'
logger.debug('getting message to set server to up')
res.sendStatus(204)
})
loadHttpServer.post('/state/down', function (req, res, next) {
STATE = 'down'
logger.debug('getting message to set server to down')
res.sendStatus(204)
})
loadHttpServer.post('/state/maint', function (req, res, next) {
STATE = 'maint'
logger.debug('getting message to set server to maint')
res.sendStatus(204)
})
const port = Settings.internal.clsi.port
const host = Settings.internal.clsi.host
const loadTcpPort = Settings.internal.load_balancer_agent.load_port
const loadHttpPort = Settings.internal.load_balancer_agent.local_port
if (!module.parent) {
// Called directly
// handle uncaught exceptions when running in production
if (Settings.catchErrors) {
process.removeAllListeners('uncaughtException')
process.on('uncaughtException', error =>
logger.error({ err: error }, 'uncaughtException')
)
}
app.listen(port, host, error => {
if (error) {
logger.fatal({ error }, `Error starting CLSI on ${host}:${port}`)
} else {
logger.debug(`CLSI starting up, listening on ${host}:${port}`)
}
})
loadTcpServer.listen(loadTcpPort, host, function (error) {
if (error != null) {
throw error
}
logger.debug(`Load tcp agent listening on load port ${loadTcpPort}`)
})
loadHttpServer.listen(loadHttpPort, host, function (error) {
if (error != null) {
throw error
}
logger.debug(`Load http agent listening on load port ${loadHttpPort}`)
})
}
module.exports = app
View File
@@ -0,0 +1,276 @@
const crypto = require('node:crypto')
const fs = require('node:fs')
const Path = require('node:path')
const { pipeline } = require('node:stream/promises')
const { createGzip, createGunzip } = require('node:zlib')
const tarFs = require('tar-fs')
const _ = require('lodash')
const {
fetchNothing,
fetchStream,
RequestFailedError,
} = require('@overleaf/fetch-utils')
const logger = require('@overleaf/logger')
const Metrics = require('@overleaf/metrics')
const Settings = require('@overleaf/settings')
const { CACHE_SUBDIR } = require('./OutputCacheManager')
const { isExtraneousFile } = require('./ResourceWriter')
const TIMING_BUCKETS = [
0, 10, 100, 1000, 2000, 5000, 10000, 15000, 20000, 30000,
]
const MAX_ENTRIES_IN_OUTPUT_TAR = 100
/**
* @param {string} projectId
* @param {string} userId
* @param {string} buildId
* @param {string} editorId
* @param {[{path: string}]} outputFiles
* @param {string} compileGroup
* @param {Record<string, any>} options
*/
function notifyCLSICacheAboutBuild({
projectId,
userId,
buildId,
editorId,
outputFiles,
compileGroup,
options,
}) {
if (!Settings.apis.clsiCache.enabled) return
/**
* @param {[{path: string}]} files
*/
const enqueue = files => {
Metrics.count('clsi_cache_enqueue_files', files.length)
fetchNothing(`${Settings.apis.clsiCache.url}/enqueue`, {
method: 'POST',
json: {
projectId,
userId,
buildId,
editorId,
files,
downloadHost: Settings.apis.clsi.downloadHost,
clsiServerId: Settings.apis.clsi.clsiServerId,
compileGroup,
options,
},
signal: AbortSignal.timeout(15_000),
}).catch(err => {
logger.warn(
{ err, projectId, userId, buildId },
'enqueue for clsi cache failed'
)
})
}
// PDF preview
enqueue(
outputFiles
.filter(
f =>
f.path === 'output.pdf' ||
f.path === 'output.log' ||
f.path === 'output.synctex.gz' ||
f.path.endsWith('.blg')
)
.map(f => {
if (f.path === 'output.pdf') {
return _.pick(f, 'path', 'size', 'contentId', 'ranges')
}
return _.pick(f, 'path')
})
)
// Compile Cache
buildTarball({ projectId, userId, buildId, outputFiles })
.then(() => {
enqueue([{ path: 'output.tar.gz' }])
})
.catch(err => {
logger.warn(
{ err, projectId, userId, buildId },
'build output.tar.gz for clsi cache failed'
)
})
}
/**
* @param {string} projectId
* @param {string} userId
* @param {string} buildId
* @param {[{path: string}]} outputFiles
* @return {Promise<void>}
*/
async function buildTarball({ projectId, userId, buildId, outputFiles }) {
const timer = new Metrics.Timer('clsi_cache_build', 1, {}, TIMING_BUCKETS)
const outputDir = Path.join(
Settings.path.outputDir,
userId ? `${projectId}-${userId}` : projectId,
CACHE_SUBDIR,
buildId
)
const files = outputFiles.filter(f => !isExtraneousFile(f.path))
if (files.length > MAX_ENTRIES_IN_OUTPUT_TAR) {
Metrics.inc('clsi_cache_build_too_many_entries')
throw new Error('too many output files for output.tar.gz')
}
Metrics.count('clsi_cache_build_files', files.length)
const path = Path.join(outputDir, 'output.tar.gz')
try {
await pipeline(
tarFs.pack(outputDir, { entries: files.map(f => f.path) }),
createGzip(),
fs.createWriteStream(path)
)
} catch (err) {
try {
await fs.promises.unlink(path)
} catch (e) {}
throw err
} finally {
timer.done()
}
}
/**
* @param {string} projectId
* @param {string} userId
* @param {string} editorId
* @param {string} buildId
* @param {string} outputDir
* @return {Promise<boolean>}
*/
async function downloadOutputDotSynctexFromCompileCache(
projectId,
userId,
editorId,
buildId,
outputDir
) {
if (!Settings.apis.clsiCache.enabled) return false
const timer = new Metrics.Timer(
'clsi_cache_download',
1,
{ method: 'synctex' },
TIMING_BUCKETS
)
let stream
try {
stream = await fetchStream(
`${Settings.apis.clsiCache.url}/project/${projectId}/${
userId ? `user/${userId}/` : ''
}build/${editorId}-${buildId}/search/output/output.synctex.gz`,
{
method: 'GET',
signal: AbortSignal.timeout(10_000),
}
)
} catch (err) {
if (err instanceof RequestFailedError && err.response.status === 404) {
timer.done({ status: 'not-found' })
return false
}
timer.done({ status: 'error' })
throw err
}
await fs.promises.mkdir(outputDir, { recursive: true })
const dst = Path.join(outputDir, 'output.synctex.gz')
const tmp = dst + crypto.randomUUID()
try {
await pipeline(stream, fs.createWriteStream(tmp))
await fs.promises.rename(tmp, dst)
} catch (err) {
try {
await fs.promises.unlink(tmp)
} catch {}
throw err
}
timer.done({ status: 'success' })
return true
}
/**
* @param {string} projectId
* @param {string} userId
* @param {string} compileDir
* @return {Promise<boolean>}
*/
async function downloadLatestCompileCache(projectId, userId, compileDir) {
if (!Settings.apis.clsiCache.enabled) return false
const url = `${Settings.apis.clsiCache.url}/project/${projectId}/${
userId ? `user/${userId}/` : ''
}latest/output/output.tar.gz`
const timer = new Metrics.Timer(
'clsi_cache_download',
1,
{ method: 'tar' },
TIMING_BUCKETS
)
let stream
try {
stream = await fetchStream(url, {
method: 'GET',
signal: AbortSignal.timeout(10_000),
})
} catch (err) {
if (err instanceof RequestFailedError && err.response.status === 404) {
timer.done({ status: 'not-found' })
return false
}
timer.done({ status: 'error' })
throw err
}
let n = 0
let abort = false
await pipeline(
stream,
createGunzip(),
tarFs.extract(compileDir, {
// use ignore hook for counting entries (files+folders) and validation.
// Include folders as they incur mkdir calls.
ignore(_, header) {
if (abort) return true // log once
n++
if (n > MAX_ENTRIES_IN_OUTPUT_TAR) {
abort = true
logger.warn(
{
url,
compileDir,
},
'too many entries in tar-ball from clsi-cache'
)
} else if (header.type !== 'file' && header.type !== 'directory') {
abort = true
logger.warn(
{
url,
compileDir,
entryType: header.type,
},
'unexpected entry in tar-ball from clsi-cache'
)
}
return abort
},
})
)
Metrics.count('clsi_cache_download_entries', n)
timer.done({ status: 'success' })
return !abort
}
module.exports = {
notifyCLSICacheAboutBuild,
downloadLatestCompileCache,
downloadOutputDotSynctexFromCompileCache,
}
View File
@@ -0,0 +1,20 @@
// TODO: This file was created by bulk-decaffeinate.
// Sanity-check the conversion and remove this comment.
/*
* decaffeinate suggestions:
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let commandRunnerPath
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
if ((Settings.clsi != null ? Settings.clsi.dockerRunner : undefined) === true) {
commandRunnerPath = './DockerRunner'
} else {
commandRunnerPath = './LocalCommandRunner'
}
logger.debug({ commandRunnerPath }, 'selecting command runner for clsi')
const CommandRunner = require(commandRunnerPath)
module.exports = CommandRunner
View File
@@ -0,0 +1,276 @@
const Path = require('node:path')
const RequestParser = require('./RequestParser')
const CompileManager = require('./CompileManager')
const Settings = require('@overleaf/settings')
const Metrics = require('./Metrics')
const ProjectPersistenceManager = require('./ProjectPersistenceManager')
const logger = require('@overleaf/logger')
const Errors = require('./Errors')
const { notifyCLSICacheAboutBuild } = require('./CLSICacheHandler')
let lastSuccessfulCompileTimestamp = 0
function timeSinceLastSuccessfulCompile() {
return Date.now() - lastSuccessfulCompileTimestamp
}
function compile(req, res, next) {
const timer = new Metrics.Timer('compile-request')
RequestParser.parse(req.body, function (error, request) {
if (error) {
return next(error)
}
timer.opts = request.metricsOpts
request.project_id = req.params.project_id
if (req.params.user_id != null) {
request.user_id = req.params.user_id
}
ProjectPersistenceManager.markProjectAsJustAccessed(
request.project_id,
function (error) {
if (error) {
return next(error)
}
const stats = {}
const timings = {}
CompileManager.doCompileWithLock(
request,
stats,
timings,
(error, result) => {
let { buildId, outputFiles } = result || {}
let code, status
if (outputFiles == null) {
outputFiles = []
}
if (error instanceof Errors.AlreadyCompilingError) {
code = 423 // Http 423 Locked
status = 'compile-in-progress'
} else if (error instanceof Errors.FilesOutOfSyncError) {
code = 409 // Http 409 Conflict
status = 'retry'
logger.warn(
{
projectId: request.project_id,
userId: request.user_id,
},
'files out of sync, please retry'
)
} else if (
error?.code === 'EPIPE' ||
error instanceof Errors.TooManyCompileRequestsError
) {
// docker returns EPIPE when shutting down
code = 503 // send 503 Unavailable response
status = 'unavailable'
} else if (error?.terminated) {
status = 'terminated'
} else if (error?.validate) {
status = `validation-${error.validate}`
} else if (error?.timedout) {
status = 'timedout'
logger.debug(
{ err: error, projectId: request.project_id },
'timeout running compile'
)
} else if (error) {
status = 'error'
code = 500
logger.error(
{ err: error, projectId: request.project_id },
'error running compile'
)
} else {
if (
outputFiles.some(
file => file.path === 'output.pdf' && file.size > 0
)
) {
status = 'success'
lastSuccessfulCompileTimestamp = Date.now()
} else if (request.stopOnFirstError) {
status = 'stopped-on-first-error'
} else {
status = 'failure'
logger.warn(
{ projectId: request.project_id, outputFiles },
'project failed to compile successfully, no output.pdf generated'
)
}
// log an error if any core files are found
if (outputFiles.some(file => file.path === 'core')) {
logger.error(
{ projectId: request.project_id, req, outputFiles },
'core file found in output'
)
}
}
if (error) {
outputFiles = error.outputFiles || []
buildId = error.buildId
}
if (
status === 'success' &&
request.editorId &&
request.populateClsiCache
) {
notifyCLSICacheAboutBuild({
projectId: request.project_id,
userId: request.user_id,
buildId: outputFiles[0].build,
editorId: request.editorId,
outputFiles,
compileGroup: request.compileGroup,
options: {
compiler: request.compiler,
draft: request.draft,
imageName: request.imageName
? Path.basename(request.imageName)
: undefined,
rootResourcePath: request.rootResourcePath,
stopOnFirstError: request.stopOnFirstError,
},
})
}
timer.done()
res.status(code || 200).send({
compile: {
status,
error: error?.message || error,
stats,
timings,
buildId,
outputUrlPrefix: Settings.apis.clsi.outputUrlPrefix,
outputFiles: outputFiles.map(file => ({
url:
`${Settings.apis.clsi.url}/project/${request.project_id}` +
(request.user_id != null
? `/user/${request.user_id}`
: '') +
`/build/${file.build}/output/${file.path}`,
...file,
})),
},
})
}
)
}
)
})
}
function stopCompile(req, res, next) {
const { project_id: projectId, user_id: userId } = req.params
CompileManager.stopCompile(projectId, userId, function (error) {
if (error) {
return next(error)
}
res.sendStatus(204)
})
}
function clearCache(req, res, next) {
ProjectPersistenceManager.clearProject(
req.params.project_id,
req.params.user_id,
function (error) {
if (error) {
return next(error)
}
// No content
res.sendStatus(204)
}
)
}
function syncFromCode(req, res, next) {
const { file, editorId, buildId, compileFromClsiCache } = req.query
const line = parseInt(req.query.line, 10)
const column = parseInt(req.query.column, 10)
const { imageName } = req.query
const projectId = req.params.project_id
const userId = req.params.user_id
CompileManager.syncFromCode(
projectId,
userId,
file,
line,
column,
{ imageName, editorId, buildId, compileFromClsiCache },
function (error, pdfPositions) {
if (error) {
return next(error)
}
res.json({
pdf: pdfPositions,
})
}
)
}
function syncFromPdf(req, res, next) {
const page = parseInt(req.query.page, 10)
const h = parseFloat(req.query.h)
const v = parseFloat(req.query.v)
const { imageName, editorId, buildId, compileFromClsiCache } = req.query
const projectId = req.params.project_id
const userId = req.params.user_id
CompileManager.syncFromPdf(
projectId,
userId,
page,
h,
v,
{ imageName, editorId, buildId, compileFromClsiCache },
function (error, codePositions) {
if (error) {
return next(error)
}
res.json({
code: codePositions,
})
}
)
}
function wordcount(req, res, next) {
const file = req.query.file || 'main.tex'
const projectId = req.params.project_id
const userId = req.params.user_id
const { image } = req.query
logger.debug({ image, file, projectId }, 'word count request')
CompileManager.wordcount(
projectId,
userId,
file,
image,
function (error, result) {
if (error) {
return next(error)
}
res.json({
texcount: result,
})
}
)
}
function status(req, res, next) {
res.send('OK')
}
module.exports = {
compile,
stopCompile,
clearCache,
syncFromCode,
syncFromPdf,
wordcount,
status,
timeSinceLastSuccessfulCompile,
}
View File
@@ -0,0 +1,701 @@
const fsPromises = require('node:fs/promises')
const os = require('node:os')
const Path = require('node:path')
const { callbackify } = require('node:util')
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
const OError = require('@overleaf/o-error')
const ResourceWriter = require('./ResourceWriter')
const LatexRunner = require('./LatexRunner')
const OutputFileFinder = require('./OutputFileFinder')
const OutputCacheManager = require('./OutputCacheManager')
const Metrics = require('./Metrics')
const DraftModeManager = require('./DraftModeManager')
const TikzManager = require('./TikzManager')
const LockManager = require('./LockManager')
const Errors = require('./Errors')
const CommandRunner = require('./CommandRunner')
const { emitPdfStats } = require('./ContentCacheMetrics')
const SynctexOutputParser = require('./SynctexOutputParser')
const {
downloadLatestCompileCache,
downloadOutputDotSynctexFromCompileCache,
} = require('./CLSICacheHandler')
const COMPILE_TIME_BUCKETS = [
// NOTE: These buckets are locked in per metric name.
// If you want to change them, you will need to rename metrics.
0, 1, 2, 3, 4, 6, 8, 11, 15, 22, 31, 43, 61, 86, 121, 170, 240,
].map(seconds => seconds * 1000)
function getCompileName(projectId, userId) {
if (userId != null) {
return `${projectId}-${userId}`
} else {
return projectId
}
}
function getCompileDir(projectId, userId) {
return Path.join(Settings.path.compilesDir, getCompileName(projectId, userId))
}
function getOutputDir(projectId, userId) {
return Path.join(Settings.path.outputDir, getCompileName(projectId, userId))
}
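// With { recursive: true }, fs.mkdir resolves to the first directory it actually
// created (or undefined if everything already existed), so comparing the result to
// compileDir detects the very first compile for this project/user. The per-directory
// lock below then prevents simultaneous compiles of the same compile dir.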
async function doCompileWithLock(request, stats, timings) {
const compileDir = getCompileDir(request.project_id, request.user_id)
request.isInitialCompile =
(await fsPromises.mkdir(compileDir, { recursive: true })) === compileDir
// prevent simultaneous compiles
const lock = LockManager.acquire(compileDir)
try {
return await doCompile(request, stats, timings)
} finally {
lock.release()
}
}
async function doCompile(request, stats, timings) {
const { project_id: projectId, user_id: userId } = request
const compileDir = getCompileDir(request.project_id, request.user_id)
const timerE2E = new Metrics.Timer(
'compile-e2e-v2',
1,
request.metricsOpts,
COMPILE_TIME_BUCKETS
)
if (request.isInitialCompile) {
stats.isInitialCompile = 1
request.metricsOpts.compile = 'initial'
if (request.compileFromClsiCache) {
try {
if (await downloadLatestCompileCache(projectId, userId, compileDir)) {
stats.restoredClsiCache = 1
request.metricsOpts.compile = 'from-clsi-cache'
}
} catch (err) {
logger.warn(
{ err, projectId, userId },
'failed to populate compile dir from cache'
)
}
}
} else {
request.metricsOpts.compile = 'recompile'
}
const writeToDiskTimer = new Metrics.Timer(
'write-to-disk',
1,
request.metricsOpts
)
logger.debug(
{ projectId: request.project_id, userId: request.user_id },
'syncing resources to disk'
)
let resourceList
try {
// NOTE: resourceList is insecure, it should only be used to exclude files from the output list
resourceList = await ResourceWriter.promises.syncResourcesToDisk(
request,
compileDir
)
} catch (error) {
if (error instanceof Errors.FilesOutOfSyncError) {
OError.tag(error, 'files out of sync, please retry', {
projectId: request.project_id,
userId: request.user_id,
})
} else {
OError.tag(error, 'error writing resources to disk', {
projectId: request.project_id,
userId: request.user_id,
})
}
throw error
}
logger.debug(
{
projectId: request.project_id,
userId: request.user_id,
timeTaken: Date.now() - writeToDiskTimer.start,
},
'written files to disk'
)
timings.sync = writeToDiskTimer.done()
// set up environment variables for chktex
const env = {
OVERLEAF_PROJECT_ID: request.project_id,
}
if (Settings.texliveOpenoutAny && Settings.texliveOpenoutAny !== '') {
// override default texlive openout_any environment variable
env.openout_any = Settings.texliveOpenoutAny
}
if (Settings.texliveMaxPrintLine && Settings.texliveMaxPrintLine !== '') {
// override default texlive max_print_line environment variable
env.max_print_line = Settings.texliveMaxPrintLine
}
// only run chktex on LaTeX files (not knitr .Rtex files or any others)
const isLaTeXFile = request.rootResourcePath?.match(/\.tex$/i)
if (request.check != null && isLaTeXFile) {
env.CHKTEX_OPTIONS = '-nall -e9 -e10 -w15 -w16'
env.CHKTEX_ULIMIT_OPTIONS = '-t 5 -v 64000'
if (request.check === 'error') {
env.CHKTEX_EXIT_ON_ERROR = 1
}
if (request.check === 'validate') {
env.CHKTEX_VALIDATE = 1
}
}
// apply a series of file modifications/creations for draft mode and tikz
if (request.draft) {
await DraftModeManager.promises.injectDraftMode(
Path.join(compileDir, request.rootResourcePath)
)
}
const needsMainFile = await TikzManager.promises.checkMainFile(
compileDir,
request.rootResourcePath,
resourceList
)
if (needsMainFile) {
await TikzManager.promises.injectOutputFile(
compileDir,
request.rootResourcePath
)
}
const compileTimer = new Metrics.Timer('run-compile', 1, request.metricsOpts)
// find the image tag to log it as a metric, e.g. 2015.1 (convert . to - for graphite)
let tag = 'default'
if (request.imageName != null) {
const match = request.imageName.match(/:(.*)/)
if (match != null) {
tag = match[1].replace(/\./g, '-')
}
}
// exclude smoke test
if (!request.project_id.match(/^[0-9a-f]{24}$/)) {
tag = 'other'
}
Metrics.inc('compiles', 1, request.metricsOpts)
Metrics.inc(`compiles-with-image.${tag}`, 1, request.metricsOpts)
const compileName = getCompileName(request.project_id, request.user_id)
try {
await LatexRunner.promises.runLatex(compileName, {
directory: compileDir,
mainFile: request.rootResourcePath,
compiler: request.compiler,
timeout: request.timeout,
image: request.imageName,
flags: request.flags,
environment: env,
compileGroup: request.compileGroup,
stopOnFirstError: request.stopOnFirstError,
stats,
timings,
})
// We use errors to return the validation state. It would be nice to use a
// more appropriate mechanism.
if (request.check === 'validate') {
const validationError = new Error('validation')
validationError.validate = 'pass'
throw validationError
}
} catch (originalError) {
let error = originalError
// request was for validation only
if (request.check === 'validate' && !error.validate) {
error = new Error('validation')
error.validate = originalError.code ? 'fail' : 'pass'
}
// request was for compile, and failed on validation
if (request.check === 'error' && originalError.message === 'exited') {
error = new Error('compilation')
error.validate = 'fail'
}
// record timeout errors as a separate counter, success is recorded later
if (error.timedout) {
Metrics.inc('compiles-timeout', 1, request.metricsOpts)
}
const { outputFiles, allEntries, buildId } = await _saveOutputFiles({
request,
compileDir,
resourceList,
stats,
timings,
})
error.outputFiles = outputFiles // return output files so user can check logs
error.buildId = buildId
// Clear project if this compile was abruptly terminated
if (error.terminated || error.timedout) {
await clearProjectWithListing(
request.project_id,
request.user_id,
allEntries
)
}
throw error
}
// compile completed normally
Metrics.inc('compiles-succeeded', 1, request.metricsOpts)
for (const metricKey in stats) {
const metricValue = stats[metricKey]
Metrics.count(metricKey, metricValue, 1, request.metricsOpts)
}
for (const metricKey in timings) {
const metricValue = timings[metricKey]
Metrics.timing(metricKey, metricValue, 1, request.metricsOpts)
}
const loadavg = typeof os.loadavg === 'function' ? os.loadavg() : undefined
if (loadavg != null) {
Metrics.gauge('load-avg', loadavg[0])
}
const ts = compileTimer.done()
logger.debug(
{
projectId: request.project_id,
userId: request.user_id,
timeTaken: ts,
stats,
timings,
loadavg,
},
'done compile'
)
if (stats['latex-runs'] > 0) {
Metrics.histogram(
'avg-compile-per-pass-v2',
ts / stats['latex-runs'],
COMPILE_TIME_BUCKETS,
request.metricsOpts
)
Metrics.timing(
'avg-compile-per-pass-v2',
ts / stats['latex-runs'],
1,
request.metricsOpts
)
}
if (stats['latex-runs'] > 0 && timings['cpu-time'] > 0) {
Metrics.timing(
'run-compile-cpu-time-per-pass',
timings['cpu-time'] / stats['latex-runs'],
1,
request.metricsOpts
)
}
// Emit compile time.
timings.compile = ts
const { outputFiles, buildId } = await _saveOutputFiles({
request,
compileDir,
resourceList,
stats,
timings,
})
// Emit e2e compile time.
timings.compileE2E = timerE2E.done()
Metrics.timing('compile-e2e-v2', timings.compileE2E, 1, request.metricsOpts)
if (stats['pdf-size']) {
emitPdfStats(stats, timings, request)
}
return { outputFiles, buildId }
}
async function _saveOutputFiles({
request,
compileDir,
resourceList,
stats,
timings,
}) {
const timer = new Metrics.Timer(
'process-output-files',
1,
request.metricsOpts
)
const outputDir = getOutputDir(request.project_id, request.user_id)
const { outputFiles: rawOutputFiles, allEntries } =
await OutputFileFinder.promises.findOutputFiles(resourceList, compileDir)
const { buildId, outputFiles } =
await OutputCacheManager.promises.saveOutputFiles(
{ request, stats, timings },
rawOutputFiles,
compileDir,
outputDir
)
timings.output = timer.done()
return { outputFiles, allEntries, buildId }
}
async function stopCompile(projectId, userId) {
const compileName = getCompileName(projectId, userId)
await LatexRunner.promises.killLatex(compileName)
}
async function clearProject(projectId, userId) {
const compileDir = getCompileDir(projectId, userId)
await fsPromises.rm(compileDir, { force: true, recursive: true })
}
async function clearProjectWithListing(projectId, userId, allEntries) {
const compileDir = getCompileDir(projectId, userId)
const exists = await _checkDirectory(compileDir)
if (!exists) {
// skip removal if no directory present
return
}
for (const pathInProject of allEntries) {
const path = Path.join(compileDir, pathInProject)
if (path.endsWith('/')) {
await fsPromises.rmdir(path)
} else {
await fsPromises.unlink(path)
}
}
await fsPromises.rmdir(compileDir)
}
async function _findAllDirs() {
const root = Settings.path.compilesDir
const files = await fsPromises.readdir(root)
const allDirs = files.map(file => Path.join(root, file))
return allDirs
}
async function clearExpiredProjects(maxCacheAgeMs) {
const now = Date.now()
const dirs = await _findAllDirs()
for (const dir of dirs) {
let stats
try {
stats = await fsPromises.stat(dir)
} catch (err) {
// ignore errors checking directory
continue
}
const age = now - stats.mtime
const hasExpired = age > maxCacheAgeMs
if (hasExpired) {
await fsPromises.rm(dir, { force: true, recursive: true })
}
}
}
async function _checkDirectory(compileDir) {
let stats
try {
stats = await fsPromises.lstat(compileDir)
} catch (err) {
if (err.code === 'ENOENT') {
// directory does not exist
return false
}
OError.tag(err, 'error on stat of project directory for removal', {
dir: compileDir,
})
throw err
}
if (!stats.isDirectory()) {
throw new OError('project directory is not directory', {
dir: compileDir,
stats,
})
}
return true
}
async function syncFromCode(projectId, userId, filename, line, column, opts) {
// If LaTeX was run in a virtual environment, the file path that synctex expects
// might not match the file path on the host. The .synctex.gz file, however, will be
// accessed wherever it is on the host.
const compileName = getCompileName(projectId, userId)
const baseDir = Settings.path.synctexBaseDir(compileName)
const inputFilePath = Path.join(baseDir, filename)
const outputFilePath = Path.join(baseDir, 'output.pdf')
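// synctex "view" maps a (line, column, file) position in the source to coordinates
// in the generated PDF. Illustrative invocation (paths are examples only):
//   synctex view -i 42:1:/compile/<project>-<user>/main.tex -o /compile/<project>-<user>/output.pdf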
const command = [
'synctex',
'view',
'-i',
`${line}:${column}:${inputFilePath}`,
'-o',
outputFilePath,
]
const stdout = await _runSynctex(projectId, userId, command, opts)
logger.debug(
{ projectId, userId, filename, line, column, command, stdout },
'synctex code output'
)
return SynctexOutputParser.parseViewOutput(stdout)
}
async function syncFromPdf(projectId, userId, page, h, v, opts) {
const compileName = getCompileName(projectId, userId)
const baseDir = Settings.path.synctexBaseDir(compileName)
const outputFilePath = `${baseDir}/output.pdf`
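// synctex "edit" is the inverse lookup: it maps a (page, h, v) position in the PDF
// back to a file/line/column in the source.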
const command = [
'synctex',
'edit',
'-o',
`${page}:${h}:${v}:${outputFilePath}`,
]
const stdout = await _runSynctex(projectId, userId, command, opts)
logger.debug({ projectId, userId, page, h, v, stdout }, 'synctex pdf output')
return SynctexOutputParser.parseEditOutput(stdout, baseDir)
}
async function _checkFileExists(dir, filename) {
try {
await fsPromises.stat(dir)
} catch (error) {
if (error.code === 'ENOENT') {
throw new Errors.NotFoundError('no output directory')
}
throw error
}
const file = Path.join(dir, filename)
let stats
try {
stats = await fsPromises.stat(file)
} catch (error) {
if (error.code === 'ENOENT') {
throw new Errors.NotFoundError('no output file')
}
// re-throw unexpected errors instead of falling through with `stats` undefined
throw error
}
if (!stats.isFile()) {
throw new Error('not a file')
}
}
async function _runSynctex(projectId, userId, command, opts) {
const { imageName, editorId, buildId, compileFromClsiCache } = opts
if (imageName && !_isImageNameAllowed(imageName)) {
throw new Errors.InvalidParameter('invalid image')
}
if (editorId && !/^[a-f0-9-]+$/.test(editorId)) {
throw new Errors.InvalidParameter('invalid editorId')
}
if (buildId && !OutputCacheManager.BUILD_REGEX.test(buildId)) {
throw new Errors.InvalidParameter('invalid buildId')
}
const outputDir = getOutputDir(projectId, userId)
const runInOutputDir = buildId && CommandRunner.canRunSyncTeXInOutputDir()
const directory = runInOutputDir
? Path.join(outputDir, OutputCacheManager.CACHE_SUBDIR, buildId)
: getCompileDir(projectId, userId)
const timeout = 60 * 1000 // increased to allow for large projects
const compileName = getCompileName(projectId, userId)
const compileGroup = runInOutputDir ? 'synctex-output' : 'synctex'
const defaultImageName =
Settings.clsi && Settings.clsi.docker && Settings.clsi.docker.image
// eslint-disable-next-line @typescript-eslint/return-await
return await OutputCacheManager.promises.queueDirOperation(
outputDir,
/**
* @return {Promise<string>}
*/
async () => {
try {
await _checkFileExists(directory, 'output.synctex.gz')
} catch (err) {
if (
err instanceof Errors.NotFoundError &&
compileFromClsiCache &&
editorId &&
buildId
) {
try {
await downloadOutputDotSynctexFromCompileCache(
projectId,
userId,
editorId,
buildId,
directory
)
} catch (err) {
logger.warn(
{ err, projectId, userId, editorId, buildId },
'failed to download output.synctex.gz from clsi-cache'
)
}
await _checkFileExists(directory, 'output.synctex.gz')
} else {
throw err
}
}
try {
const output = await CommandRunner.promises.run(
compileName,
command,
directory,
imageName || defaultImageName,
timeout,
{},
compileGroup
)
return output.stdout
} catch (error) {
throw OError.tag(error, 'error running synctex', {
command,
projectId,
userId,
})
}
}
)
}
async function wordcount(projectId, userId, filename, image) {
logger.debug({ projectId, userId, filename, image }, 'running wordcount')
const filePath = `$COMPILE_DIR/${filename}`
const command = ['texcount', '-nocol', '-inc', filePath]
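// texcount flags: -nocol suppresses colour codes in the output (easier to parse),
// -inc also counts files pulled in via \include/\input.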
const compileDir = getCompileDir(projectId, userId)
const timeout = 60 * 1000
const compileName = getCompileName(projectId, userId)
const compileGroup = 'wordcount'
if (image && !_isImageNameAllowed(image)) {
throw new Errors.InvalidParameter('invalid image')
}
try {
await fsPromises.mkdir(compileDir, { recursive: true })
} catch (err) {
throw OError.tag(err, 'error ensuring dir for wordcount', {
projectId,
userId,
filename,
})
}
try {
const { stdout } = await CommandRunner.promises.run(
compileName,
command,
compileDir,
image,
timeout,
{},
compileGroup
)
const results = _parseWordcountFromOutput(stdout)
logger.debug(
{ projectId, userId, wordcount: results },
'word count results'
)
return results
} catch (err) {
throw OError.tag(err, 'error reading word count output', {
command,
compileDir,
projectId,
userId,
})
}
}
function _parseWordcountFromOutput(output) {
const results = {
encode: '',
textWords: 0,
headWords: 0,
outside: 0,
headers: 0,
elements: 0,
mathInline: 0,
mathDisplay: 0,
errors: 0,
messages: '',
}
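// The loop below scans texcount's summary output, which looks roughly like:
//   Encoding: ascii
//   Words in text: 123
//   Words in headers: 9
//   Words outside text (captions, etc.): 4
//   Number of headers: 3
//   Number of floats/tables/figures: 2
//   Number of math inlines: 5
//   Number of math displayed: 1
// (shown for illustration only; just the substrings matched below matter)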
for (const line of output.split('\n')) {
const [data, info] = line.split(':')
if (data.indexOf('Encoding') > -1) {
results.encode = info.trim()
}
if (data.indexOf('in text') > -1) {
results.textWords = parseInt(info, 10)
}
if (data.indexOf('in head') > -1) {
results.headWords = parseInt(info, 10)
}
if (data.indexOf('outside') > -1) {
results.outside = parseInt(info, 10)
}
if (data.indexOf('of head') > -1) {
results.headers = parseInt(info, 10)
}
if (data.indexOf('Number of floats/tables/figures') > -1) {
results.elements = parseInt(info, 10)
}
if (data.indexOf('Number of math inlines') > -1) {
results.mathInline = parseInt(info, 10)
}
if (data.indexOf('Number of math displayed') > -1) {
results.mathDisplay = parseInt(info, 10)
}
if (data === '(errors') {
// errors reported as (errors:123)
results.errors = parseInt(info, 10)
}
if (line.indexOf('!!! ') > -1) {
// errors logged as !!! message !!!
results.messages += line + '\n'
}
}
return results
}
function _isImageNameAllowed(imageName) {
const ALLOWED_IMAGES =
Settings.clsi && Settings.clsi.docker && Settings.clsi.docker.allowedImages
return !ALLOWED_IMAGES || ALLOWED_IMAGES.includes(imageName)
}
module.exports = {
doCompileWithLock: callbackify(doCompileWithLock),
stopCompile: callbackify(stopCompile),
clearProject: callbackify(clearProject),
clearExpiredProjects: callbackify(clearExpiredProjects),
syncFromCode: callbackify(syncFromCode),
syncFromPdf: callbackify(syncFromPdf),
wordcount: callbackify(wordcount),
promises: {
doCompileWithLock,
stopCompile,
clearProject,
clearExpiredProjects,
syncFromCode,
syncFromPdf,
wordcount,
},
}

View File

@@ -0,0 +1,441 @@
/**
* ContentCacheManager - maintains a cache of stream hashes from a PDF file
*/
const { callbackify } = require('node:util')
const fs = require('node:fs')
const crypto = require('node:crypto')
const Path = require('node:path')
const Settings = require('@overleaf/settings')
const OError = require('@overleaf/o-error')
const pLimit = require('p-limit')
const { parseXrefTable } = require('./XrefParser')
const {
QueueLimitReachedError,
TimedOutError,
NoXrefTableError,
} = require('./Errors')
const workerpool = require('workerpool')
const Metrics = require('@overleaf/metrics')
/**
* @type {import('workerpool').WorkerPool}
*/
let WORKER_POOL
// NOTE: Check for main thread to avoid recursive start of pool.
if (Settings.pdfCachingEnableWorkerPool && workerpool.isMainThread) {
WORKER_POOL = workerpool.pool(Path.join(__dirname, 'ContentCacheWorker.js'), {
// Cap number of worker threads.
maxWorkers: Settings.pdfCachingWorkerPoolSize,
// Warmup workers.
minWorkers: Settings.pdfCachingWorkerPoolSize,
// Limit queue back-log
maxQueueSize: Settings.pdfCachingWorkerPoolBackLogLimit,
})
setInterval(() => {
const {
totalWorkers,
busyWorkers,
idleWorkers,
pendingTasks,
activeTasks,
} = WORKER_POOL.stats()
Metrics.gauge('pdf_caching_total_workers', totalWorkers)
Metrics.gauge('pdf_caching_busy_workers', busyWorkers)
Metrics.gauge('pdf_caching_idle_workers', idleWorkers)
Metrics.gauge('pdf_caching_pending_tasks', pendingTasks)
Metrics.gauge('pdf_caching_active_tasks', activeTasks)
}, 15 * 1000)
}
/**
*
* @param {String} contentDir path to directory where content hash files are cached
* @param {String} filePath the pdf file to scan for streams
* @param {number} pdfSize the pdf size
* @param {number} pdfCachingMinChunkSize per request threshold
* @param {number} compileTime
*/
async function update({
contentDir,
filePath,
pdfSize,
pdfCachingMinChunkSize,
compileTime,
}) {
if (pdfSize < pdfCachingMinChunkSize) {
return {
contentRanges: [],
newContentRanges: [],
reclaimedSpace: 0,
startXRefTable: undefined,
}
}
if (Settings.pdfCachingEnableWorkerPool) {
return await updateOtherEventLoop({
contentDir,
filePath,
pdfSize,
pdfCachingMinChunkSize,
compileTime,
})
} else {
return await updateSameEventLoop({
contentDir,
filePath,
pdfSize,
pdfCachingMinChunkSize,
compileTime,
})
}
}
/**
*
* @param {String} contentDir path to directory where content hash files are cached
* @param {String} filePath the pdf file to scan for streams
* @param {number} pdfSize the pdf size
* @param {number} pdfCachingMinChunkSize per request threshold
* @param {number} compileTime
*/
async function updateOtherEventLoop({
contentDir,
filePath,
pdfSize,
pdfCachingMinChunkSize,
compileTime,
}) {
const workerLatencyInMs = 100
// Prefer getting the timeout error from the worker vs timing out the worker.
const timeout = getMaxOverhead(compileTime) + workerLatencyInMs
try {
return await WORKER_POOL.exec('updateSameEventLoop', [
{
contentDir,
filePath,
pdfSize,
pdfCachingMinChunkSize,
compileTime,
},
]).timeout(timeout)
} catch (e) {
if (e instanceof workerpool.Promise.TimeoutError) {
throw new TimedOutError('context-lost-in-worker', { timeout })
}
if (e.message?.includes?.('Max queue size of ')) {
throw new QueueLimitReachedError()
}
if (e.message?.includes?.('xref')) {
throw new NoXrefTableError(e.message)
}
throw e
}
}
/**
*
* @param {String} contentDir path to directory where content hash files are cached
* @param {String} filePath the pdf file to scan for streams
* @param {number} pdfSize the pdf size
* @param {number} pdfCachingMinChunkSize per request threshold
* @param {number} compileTime
*/
async function updateSameEventLoop({
contentDir,
filePath,
pdfSize,
pdfCachingMinChunkSize,
compileTime,
}) {
const checkDeadline = getDeadlineChecker(compileTime)
// keep track of hashes, and expire old ones once they reach a generation > N.
const tracker = await HashFileTracker.from(contentDir)
tracker.updateAge()
checkDeadline('after init HashFileTracker')
const [reclaimedSpace, overheadDeleteStaleHashes] =
await tracker.deleteStaleHashes(5)
checkDeadline('after delete stale hashes')
const { xRefEntries, startXRefTable } = await parseXrefTable(
filePath,
pdfSize
)
xRefEntries.sort((a, b) => {
return a.offset - b.offset
})
xRefEntries.forEach((obj, idx) => {
obj.idx = idx
})
checkDeadline('after parsing')
const uncompressedObjects = []
for (const object of xRefEntries) {
if (!object.uncompressed) {
continue
}
const nextObject = xRefEntries[object.idx + 1]
if (!nextObject) {
// Ignore this possible edge case.
// The last object should be part of the xRef table.
continue
} else {
object.endOffset = nextObject.offset
}
const size = object.endOffset - object.offset
object.size = size
if (size < pdfCachingMinChunkSize) {
continue
}
uncompressedObjects.push({ object, idx: uncompressedObjects.length })
}
checkDeadline('after finding uncompressed')
let timedOutErr = null
const contentRanges = []
const newContentRanges = []
const handle = await fs.promises.open(filePath)
try {
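// For every uncompressed object: read its bytes, strip the leading object-number
// prefix (everything before the "obj" keyword), and hash the remaining stream.
// Ranges whose hash is already tracked are reused without rewriting them to disk.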
for (const { object, idx } of uncompressedObjects) {
let buffer = Buffer.alloc(object.size, 0)
const { bytesRead } = await handle.read(
buffer,
0,
object.size,
object.offset
)
checkDeadline('after read ' + idx)
if (bytesRead !== object.size) {
throw new OError('could not read full chunk', {
object,
bytesRead,
})
}
const idxObj = buffer.indexOf('obj')
if (idxObj > 100) {
throw new OError('objectId is too large', {
object,
idxObj,
})
}
const objectIdRaw = buffer.subarray(0, idxObj)
buffer = buffer.subarray(objectIdRaw.byteLength)
const hash = pdfStreamHash(buffer)
checkDeadline('after hash ' + idx)
const range = {
objectId: objectIdRaw.toString(),
start: object.offset + objectIdRaw.byteLength,
end: object.endOffset,
hash,
}
if (tracker.has(range.hash)) {
// Optimization: Skip writing of already seen hashes.
tracker.track(range)
contentRanges.push(range)
continue
}
await writePdfStream(contentDir, hash, buffer)
tracker.track(range)
contentRanges.push(range)
newContentRanges.push(range)
checkDeadline('after write ' + idx)
}
} catch (err) {
if (err instanceof TimedOutError) {
// Let the frontend use ranges that were processed so far.
timedOutErr = err
} else {
throw err
}
} finally {
await handle.close()
// Flush from both success and failure code path. This allows the next
// cycle to complete faster as it can use the already written ranges.
await tracker.flush()
}
return {
contentRanges,
newContentRanges,
reclaimedSpace,
startXRefTable,
overheadDeleteStaleHashes,
timedOutErr,
}
}
function getStatePath(contentDir) {
return Path.join(contentDir, '.state.v0.json')
}
class HashFileTracker {
constructor(contentDir, { hashAge = [], hashSize = [] }) {
this.contentDir = contentDir
this.hashAge = new Map(hashAge)
this.hashSize = new Map(hashSize)
}
static async from(contentDir) {
const statePath = getStatePath(contentDir)
let state = {}
try {
const blob = await fs.promises.readFile(statePath)
state = JSON.parse(blob)
} catch (e) {}
return new HashFileTracker(contentDir, state)
}
has(hash) {
return this.hashAge.has(hash)
}
track(range) {
if (!this.hashSize.has(range.hash)) {
this.hashSize.set(range.hash, range.end - range.start)
}
this.hashAge.set(range.hash, 0)
}
updateAge() {
for (const [hash, age] of this.hashAge) {
this.hashAge.set(hash, age + 1)
}
return this
}
findStale(maxAge) {
const stale = []
for (const [hash, age] of this.hashAge) {
if (age > maxAge) {
stale.push(hash)
}
}
return stale
}
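// Persist the tracker state atomically: write to a "<state>~" temp file first and
// rename it over the real state file, so a crash never leaves a half-written state.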
async flush() {
const statePath = getStatePath(this.contentDir)
const blob = JSON.stringify({
hashAge: Array.from(this.hashAge.entries()),
hashSize: Array.from(this.hashSize.entries()),
})
const atomicWrite = statePath + '~'
try {
await fs.promises.writeFile(atomicWrite, blob)
} catch (err) {
try {
await fs.promises.unlink(atomicWrite)
} catch (e) {}
throw err
}
try {
await fs.promises.rename(atomicWrite, statePath)
} catch (err) {
try {
await fs.promises.unlink(atomicWrite)
} catch (e) {}
throw err
}
}
async deleteStaleHashes(n) {
const t0 = Date.now()
// delete any hash file older than N generations
const hashes = this.findStale(n)
let reclaimedSpace = 0
if (hashes.length === 0) {
return [reclaimedSpace, Date.now() - t0]
}
await promiseMapWithLimit(10, hashes, async hash => {
try {
await fs.promises.unlink(Path.join(this.contentDir, hash))
} catch (err) {
if (err?.code === 'ENOENT') {
// Ignore already deleted entries. The previous cleanup cycle may have
// been killed halfway through the deletion process, or before we
// flushed the state to disk.
} else {
throw err
}
}
this.hashAge.delete(hash)
reclaimedSpace += this.hashSize.get(hash)
this.hashSize.delete(hash)
})
return [reclaimedSpace, Date.now() - t0]
}
}
function pdfStreamHash(buffer) {
const hash = crypto.createHash('sha256')
hash.update(buffer)
return hash.digest('hex')
}
async function writePdfStream(dir, hash, buffer) {
const filename = Path.join(dir, hash)
const atomicWriteFilename = filename + '~'
try {
await fs.promises.writeFile(atomicWriteFilename, buffer)
await fs.promises.rename(atomicWriteFilename, filename)
} catch (err) {
// clean up the temp file, but always surface the original error
try {
await fs.promises.unlink(atomicWriteFilename)
} catch (_) {}
throw err
}
}
function getMaxOverhead(compileTime) {
return Math.min(
// Adding 10s to a 40s compile time is OK.
// Adding 1s to a 3s compile time is OK.
Math.max(compileTime / 4, 1000),
// Adding 30s to a 120s compile time is not OK, limit to 10s.
Settings.pdfCachingMaxProcessingTime
)
}
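// Returns a checkpoint function; each call records the completed stage and throws
// a TimedOutError once the time budget derived from the compile time is used up.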
function getDeadlineChecker(compileTime) {
const timeout = getMaxOverhead(compileTime)
const deadline = Date.now() + timeout
let lastStage = { stage: 'start', now: Date.now() }
let completedStages = 0
return function (stage) {
const now = Date.now()
if (now > deadline) {
throw new TimedOutError(stage, {
timeout,
completedStages,
lastStage: lastStage.stage,
diffToLastStage: now - lastStage.now,
})
}
completedStages++
lastStage = { stage, now }
}
}
function promiseMapWithLimit(concurrency, array, fn) {
const limit = pLimit(concurrency)
return Promise.all(array.map(x => limit(() => fn(x))))
}
module.exports = {
HASH_REGEX: /^[0-9a-f]{64}$/,
update: callbackify(update),
promises: {
update,
updateSameEventLoop,
},
}

View File

@@ -0,0 +1,146 @@
const logger = require('@overleaf/logger')
const Metrics = require('./Metrics')
const os = require('node:os')
let CACHED_LOAD = {
expires: -1,
load: [0, 0, 0],
}
function getSystemLoad() {
if (CACHED_LOAD.expires < Date.now()) {
CACHED_LOAD = {
expires: Date.now() + 10 * 1000,
load: os.loadavg(),
}
}
return CACHED_LOAD.load
}
const ONE_MB = 1024 * 1024
function emitPdfStats(stats, timings, request) {
if (timings['compute-pdf-caching']) {
emitPdfCachingStats(stats, timings, request)
} else {
// How much bandwidth will the pdf incur when downloaded in full?
Metrics.summary('pdf-bandwidth', stats['pdf-size'], request.metricsOpts)
}
}
function emitPdfCachingStats(stats, timings, request) {
if (!stats['pdf-size']) return // double check
if (stats['pdf-caching-timed-out']) {
Metrics.inc('pdf-caching-timed-out', 1, request.metricsOpts)
}
if (timings['pdf-caching-overhead-delete-stale-hashes'] !== undefined) {
Metrics.summary(
'pdf-caching-overhead-delete-stale-hashes',
timings['pdf-caching-overhead-delete-stale-hashes'],
request.metricsOpts
)
}
// How much extra time did we spend in PDF.js?
Metrics.timing(
'compute-pdf-caching',
timings['compute-pdf-caching'],
1,
request.metricsOpts
)
// How large is the overhead of hashing up-front?
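// fraction = end-to-end time / (end-to-end time minus pdf-caching time);
// e.g. 1.10 means pdf caching added roughly 10% on top of the rest of the compile.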
const fraction =
timings.compileE2E - timings['compute-pdf-caching'] !== 0
? timings.compileE2E /
(timings.compileE2E - timings['compute-pdf-caching'])
: 1
if (fraction > 1.5 && timings.compileE2E > 10 * 1000) {
logger.warn(
{
stats,
timings,
load: getSystemLoad(),
},
'slow pdf caching'
)
}
Metrics.summary(
'overhead-compute-pdf-ranges',
fraction * 100 - 100,
request.metricsOpts
)
// How does the hashing scale to pdf size in MB?
Metrics.timing(
'compute-pdf-caching-relative-to-pdf-size',
timings['compute-pdf-caching'] / (stats['pdf-size'] / ONE_MB),
1,
request.metricsOpts
)
if (stats['pdf-caching-total-ranges-size']) {
// How does the hashing scale to total ranges size in MB?
Metrics.timing(
'compute-pdf-caching-relative-to-total-ranges-size',
timings['compute-pdf-caching'] /
(stats['pdf-caching-total-ranges-size'] / ONE_MB),
1,
request.metricsOpts
)
// How fast is the hashing per range on average?
Metrics.timing(
'compute-pdf-caching-relative-to-ranges-count',
timings['compute-pdf-caching'] / stats['pdf-caching-n-ranges'],
1,
request.metricsOpts
)
// How many ranges are new?
Metrics.summary(
'new-pdf-ranges-relative-to-total-ranges',
(stats['pdf-caching-n-new-ranges'] / stats['pdf-caching-n-ranges']) * 100,
request.metricsOpts
)
}
// How much content is cacheable?
Metrics.summary(
'cacheable-ranges-to-pdf-size',
(stats['pdf-caching-total-ranges-size'] / stats['pdf-size']) * 100,
request.metricsOpts
)
const sizeWhenDownloadedInFull =
// All of the pdf
stats['pdf-size'] -
// These ranges are potentially cached.
stats['pdf-caching-total-ranges-size'] +
// These ranges are not cached.
stats['pdf-caching-new-ranges-size']
// How much bandwidth can we save when downloading the pdf in full?
Metrics.summary(
'pdf-bandwidth-savings',
100 - (sizeWhenDownloadedInFull / stats['pdf-size']) * 100,
request.metricsOpts
)
// How much bandwidth will the pdf incur when downloaded in full?
Metrics.summary(
'pdf-bandwidth',
sizeWhenDownloadedInFull,
request.metricsOpts
)
// How much space do the ranges use?
// This will accumulate the ranges size over time, skipping already written ranges.
Metrics.summary(
'pdf-ranges-disk-size',
stats['pdf-caching-new-ranges-size'] - stats['pdf-caching-reclaimed-space'],
request.metricsOpts
)
}
module.exports = {
emitPdfStats,
}

View File

@@ -0,0 +1,4 @@
const workerpool = require('workerpool')
const ContentCacheManager = require('./ContentCacheManager')
workerpool.worker(ContentCacheManager.promises)

View File

@@ -0,0 +1,24 @@
const Path = require('node:path')
const send = require('send')
const Settings = require('@overleaf/settings')
const OutputCacheManager = require('./OutputCacheManager')
const ONE_DAY_S = 24 * 60 * 60
const ONE_DAY_MS = ONE_DAY_S * 1000
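// Serves a single cached PDF stream ("range") by content hash from the per-project
// (or per-user) output directory. The response is cacheable for a day because the
// content is hash-addressed and never changes.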
function getPdfRange(req, res, next) {
const { projectId, userId, contentId, hash } = req.params
const perUserDir = userId ? `${projectId}-${userId}` : projectId
const path = Path.join(
Settings.path.outputDir,
perUserDir,
OutputCacheManager.CONTENT_SUBDIR,
contentId,
hash
)
res.setHeader('cache-control', `public, max-age=${ONE_DAY_S}`)
res.setHeader('expires', new Date(Date.now() + ONE_DAY_MS).toUTCString())
send(req, path).pipe(res)
}
module.exports = { getPdfRange }

View File

@@ -0,0 +1,38 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
let ContentTypeMapper
const Path = require('node:path')
// here we coerce html, css and js to text/plain,
// otherwise choose correct mime type based on file extension,
// falling back to octet-stream
module.exports = ContentTypeMapper = {
map(path) {
switch (Path.extname(path)) {
case '.txt':
case '.html':
case '.js':
case '.css':
case '.svg':
return 'text/plain'
case '.csv':
return 'text/csv'
case '.pdf':
return 'application/pdf'
case '.png':
return 'image/png'
case '.jpg':
case '.jpeg':
return 'image/jpeg'
case '.tiff':
return 'image/tiff'
case '.gif':
return 'image/gif'
default:
return 'application/octet-stream'
}
},
}

View File

@@ -0,0 +1,110 @@
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let LockManager
const logger = require('@overleaf/logger')
const LockState = {} // locks for docker container operations, by container name
module.exports = LockManager = {
MAX_LOCK_HOLD_TIME: 15000, // how long we can keep a lock
MAX_LOCK_WAIT_TIME: 10000, // how long we wait for a lock
LOCK_TEST_INTERVAL: 1000, // retry time
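// getLock() polls tryLock() every LOCK_TEST_INTERVAL ms for up to MAX_LOCK_WAIT_TIME;
// a lock held longer than MAX_LOCK_HOLD_TIME is considered stale and taken by force.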
tryLock(key, callback) {
let lockValue
if (callback == null) {
callback = function () {}
}
const existingLock = LockState[key]
if (existingLock != null) {
// the lock is already taken, check how old it is
const lockAge = Date.now() - existingLock.created
if (lockAge < LockManager.MAX_LOCK_HOLD_TIME) {
return callback(null, false) // we didn't get the lock, bail out
} else {
logger.error(
{ key, lock: existingLock, age: lockAge },
'taking old lock by force'
)
}
}
// take the lock
LockState[key] = lockValue = { created: Date.now() }
return callback(null, true, lockValue)
},
getLock(key, callback) {
let attempt
if (callback == null) {
callback = function () {}
}
const startTime = Date.now()
return (attempt = () =>
LockManager.tryLock(key, function (error, gotLock, lockValue) {
if (error != null) {
return callback(error)
}
if (gotLock) {
return callback(null, lockValue)
} else if (Date.now() - startTime > LockManager.MAX_LOCK_WAIT_TIME) {
const e = new Error('Lock timeout')
e.key = key
return callback(e)
} else {
return setTimeout(attempt, LockManager.LOCK_TEST_INTERVAL)
}
}))()
},
releaseLock(key, lockValue, callback) {
if (callback == null) {
callback = function () {}
}
const existingLock = LockState[key]
if (existingLock === lockValue) {
// lockValue is an object, so we can test by reference
delete LockState[key] // our lock, so we can free it
return callback()
} else if (existingLock != null) {
// lock exists but doesn't match ours
logger.error(
{ key, lock: existingLock },
'tried to release lock taken by force'
)
return callback()
} else {
logger.error(
{ key, lock: existingLock },
'tried to release lock that has gone'
)
return callback()
}
},
runWithLock(key, runner, callback) {
if (callback == null) {
callback = function () {}
}
return LockManager.getLock(key, function (error, lockValue) {
if (error != null) {
return callback(error)
}
return runner((error1, ...args) =>
LockManager.releaseLock(key, lockValue, function (error2) {
error = error1 || error2
if (error != null) {
return callback(error)
}
return callback(null, ...Array.from(args))
})
)
})
},
}

View File

@@ -0,0 +1,597 @@
const { promisify } = require('node:util')
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
const Docker = require('dockerode')
const dockerode = new Docker()
const crypto = require('node:crypto')
const async = require('async')
const LockManager = require('./DockerLockManager')
const Path = require('node:path')
const _ = require('lodash')
const ONE_HOUR_IN_MS = 60 * 60 * 1000
logger.debug('using docker runner')
let containerMonitorTimeout
let containerMonitorInterval
const DockerRunner = {
run(
projectId,
command,
directory,
image,
timeout,
environment,
compileGroup,
callback
) {
command = command.map(arg =>
arg.toString().replace('$COMPILE_DIR', '/compile')
)
if (image == null) {
image = Settings.clsi.docker.image
}
if (
Settings.clsi.docker.allowedImages &&
!Settings.clsi.docker.allowedImages.includes(image)
) {
return callback(new Error('image not allowed'))
}
if (Settings.texliveImageNameOveride != null) {
const img = image.split('/')
image = `${Settings.texliveImageNameOveride}/${img[2]}`
}
if (compileGroup === 'synctex-output') {
// In: directory = '/overleaf/services/clsi/output/projectId-userId/generated-files/buildId'
// directory.split('/').slice(-3) === 'projectId-userId/generated-files/buildId'
// sandboxedCompilesHostDirOutput = '/host/output'
// Out: directory = '/host/output/projectId-userId/generated-files/buildId'
directory = Path.join(
Settings.path.sandboxedCompilesHostDirOutput,
...directory.split('/').slice(-3)
)
} else {
// In: directory = '/overleaf/services/clsi/compiles/projectId-userId'
// Path.basename(directory) === 'projectId-userId'
// sandboxedCompilesHostDirCompiles = '/host/compiles'
// Out: directory = '/host/compiles/projectId-userId'
directory = Path.join(
Settings.path.sandboxedCompilesHostDirCompiles,
Path.basename(directory)
)
}
const volumes = { [directory]: '/compile' }
if (
compileGroup === 'synctex' ||
compileGroup === 'synctex-output' ||
compileGroup === 'wordcount'
) {
volumes[directory] += ':ro'
}
const options = DockerRunner._getContainerOptions(
command,
image,
volumes,
timeout,
environment,
compileGroup
)
const fingerprint = DockerRunner._fingerprintContainer(options)
const name = `project-${projectId}-${fingerprint}`
options.name = name
// logOptions = _.clone(options)
// logOptions?.HostConfig?.SecurityOpt = "secomp used, removed in logging"
logger.debug({ projectId }, 'running docker container')
DockerRunner._runAndWaitForContainer(
options,
volumes,
timeout,
(error, output) => {
if (error && error.statusCode === 500) {
logger.debug(
{ err: error, projectId },
'error running container so destroying and retrying'
)
DockerRunner.destroyContainer(name, null, true, error => {
if (error != null) {
return callback(error)
}
DockerRunner._runAndWaitForContainer(
options,
volumes,
timeout,
callback
)
})
} else {
callback(error, output)
}
}
)
// pass back the container name to allow it to be killed
return name
},
kill(containerId, callback) {
logger.debug({ containerId }, 'sending kill signal to container')
const container = dockerode.getContainer(containerId)
container.kill(error => {
if (
error != null &&
error.message != null &&
error.message.match(/Cannot kill container .* is not running/)
) {
logger.warn(
{ err: error, containerId },
'container not running, continuing'
)
error = null
}
if (error != null) {
logger.error({ err: error, containerId }, 'error killing container')
callback(error)
} else {
callback()
}
})
},
_runAndWaitForContainer(options, volumes, timeout, _callback) {
const callback = _.once(_callback)
const { name } = options
let streamEnded = false
let containerReturned = false
let output = {}
function callbackIfFinished() {
if (streamEnded && containerReturned) {
callback(null, output)
}
}
function attachStreamHandler(error, _output) {
if (error != null) {
return callback(error)
}
output = _output
streamEnded = true
callbackIfFinished()
}
DockerRunner.startContainer(
options,
volumes,
attachStreamHandler,
(error, containerId) => {
if (error != null) {
return callback(error)
}
DockerRunner.waitForContainer(name, timeout, (error, exitCode) => {
if (error != null) {
return callback(error)
}
if (exitCode === 137) {
// exit status from kill -9
const err = new Error('terminated')
err.terminated = true
return callback(err)
}
if (exitCode === 1) {
// exit status from chktex
const err = new Error('exited')
err.code = exitCode
return callback(err)
}
containerReturned = true
if (options != null && options.HostConfig != null) {
options.HostConfig.SecurityOpt = null
}
logger.debug({ exitCode, options }, 'docker container has exited')
callbackIfFinished()
})
}
)
},
_getContainerOptions(
command,
image,
volumes,
timeout,
environment,
compileGroup
) {
const timeoutInSeconds = timeout / 1000
const dockerVolumes = {}
for (const hostVol in volumes) {
const dockerVol = volumes[hostVol]
dockerVolumes[dockerVol] = {}
if (volumes[hostVol].slice(-3).indexOf(':r') === -1) {
volumes[hostVol] = `${dockerVol}:rw`
}
}
// merge settings and environment parameter
const env = {}
for (const src of [Settings.clsi.docker.env, environment || {}]) {
for (const key in src) {
const value = src[key]
env[key] = value
}
}
// set the path based on the image year
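// e.g. an image tagged ":2023.1" puts /usr/local/texlive/2023/bin/x86_64-linux/ on PATH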
const match = image.match(/:([0-9]+)\.[0-9]+/)
const year = match ? match[1] : '2014'
env.PATH = `/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/texlive/${year}/bin/x86_64-linux/`
const options = {
Cmd: command,
Image: image,
Volumes: dockerVolumes,
WorkingDir: '/compile',
NetworkDisabled: true,
Memory: 1024 * 1024 * 1024 * 1024, // 1 TiB, i.e. effectively no memory limit
User: Settings.clsi.docker.user,
Env: Object.entries(env).map(([key, value]) => `${key}=${value}`),
HostConfig: {
Binds: Object.entries(volumes).map(
([hostVol, dockerVol]) => `${hostVol}:${dockerVol}`
),
LogConfig: { Type: 'none', Config: {} },
Ulimits: [
{
Name: 'cpu',
Soft: timeoutInSeconds + 5,
Hard: timeoutInSeconds + 10,
},
],
CapDrop: 'ALL',
SecurityOpt: ['no-new-privileges'],
},
}
if (Settings.clsi.docker.seccomp_profile != null) {
options.HostConfig.SecurityOpt.push(
`seccomp=${Settings.clsi.docker.seccomp_profile}`
)
}
if (Settings.clsi.docker.apparmor_profile != null) {
options.HostConfig.SecurityOpt.push(
`apparmor=${Settings.clsi.docker.apparmor_profile}`
)
}
if (Settings.clsi.docker.runtime) {
options.HostConfig.Runtime = Settings.clsi.docker.runtime
}
if (Settings.clsi.docker.Readonly) {
options.HostConfig.ReadonlyRootfs = true
options.HostConfig.Tmpfs = { '/tmp': 'rw,noexec,nosuid,size=65536k' }
options.Volumes['/home/tex'] = {}
}
// Allow per-compile group overriding of individual settings
if (
Settings.clsi.docker.compileGroupConfig &&
Settings.clsi.docker.compileGroupConfig[compileGroup]
) {
const override = Settings.clsi.docker.compileGroupConfig[compileGroup]
for (const key in override) {
_.set(options, key, override[key])
}
}
return options
},
_fingerprintContainer(containerOptions) {
// Yay, Hashing!
const json = JSON.stringify(containerOptions)
return crypto.createHash('md5').update(json).digest('hex')
},
startContainer(options, volumes, attachStreamHandler, callback) {
LockManager.runWithLock(
options.name,
releaseLock =>
DockerRunner._startContainer(
options,
volumes,
attachStreamHandler,
releaseLock
),
callback
)
},
// Check that volumes exist and are directories
_startContainer(options, volumes, attachStreamHandler, callback) {
callback = _.once(callback)
const { name } = options
logger.debug({ containerName: name }, 'starting container')
const container = dockerode.getContainer(name)
function createAndStartContainer() {
dockerode.createContainer(options, (error, container) => {
if (error != null) {
return callback(error)
}
startExistingContainer()
})
}
function startExistingContainer() {
DockerRunner.attachToContainer(
options.name,
attachStreamHandler,
error => {
if (error != null) {
return callback(error)
}
container.start(error => {
if (error != null && error.statusCode !== 304) {
callback(error)
} else {
// already running
callback()
}
})
}
)
}
container.inspect((error, stats) => {
if (error != null && error.statusCode === 404) {
createAndStartContainer()
} else if (error != null) {
logger.err(
{ containerName: name, error },
'unable to inspect container to start'
)
callback(error)
} else {
startExistingContainer()
}
})
},
attachToContainer(containerId, attachStreamHandler, attachStartCallback) {
const container = dockerode.getContainer(containerId)
container.attach({ stdout: 1, stderr: 1, stream: 1 }, (error, stream) => {
if (error != null) {
logger.error(
{ err: error, containerId },
'error attaching to container'
)
return attachStartCallback(error)
} else {
attachStartCallback()
}
logger.debug({ containerId }, 'attached to container')
const MAX_OUTPUT = 1024 * 1024 * 2 // limit output to 2MB
function createStringOutputStream(name) {
return {
data: '',
overflowed: false,
write(data) {
if (this.overflowed) {
return
}
if (this.data.length < MAX_OUTPUT) {
this.data += data
} else {
logger.info(
{
containerId,
length: this.data.length,
maxLen: MAX_OUTPUT,
},
`${name} exceeds max size`
)
this.data += `(...truncated at ${MAX_OUTPUT} chars...)`
this.overflowed = true
}
},
// kill container if too much output
// docker.containers.kill(containerId, () ->)
}
}
const stdout = createStringOutputStream('stdout')
const stderr = createStringOutputStream('stderr')
container.modem.demuxStream(stream, stdout, stderr)
stream.on('error', err =>
logger.error(
{ err, containerId },
'error reading from container stream'
)
)
stream.on('end', () =>
attachStreamHandler(null, { stdout: stdout.data, stderr: stderr.data })
)
})
},
waitForContainer(containerId, timeout, _callback) {
const callback = _.once(_callback)
const container = dockerode.getContainer(containerId)
let timedOut = false
const timeoutId = setTimeout(() => {
timedOut = true
logger.debug({ containerId }, 'timeout reached, killing container')
container.kill(err => {
if (err) {
logger.warn({ err, containerId }, 'failed to kill container')
}
})
}, timeout)
logger.debug({ containerId }, 'waiting for docker container')
container.wait((error, res) => {
if (error != null) {
clearTimeout(timeoutId)
logger.warn({ err: error, containerId }, 'error waiting for container')
return callback(error)
}
if (timedOut) {
logger.debug({ containerId }, 'docker container timed out')
error = new Error('container timed out')
error.timedout = true
callback(error)
} else {
clearTimeout(timeoutId)
logger.debug(
{ containerId, exitCode: res.StatusCode },
'docker container returned'
)
callback(null, res.StatusCode)
}
})
},
destroyContainer(containerName, containerId, shouldForce, callback) {
// We want the containerName for the lock and, ideally, the
// containerId to delete. There is a bug in the docker.io module
// where if you delete by name and there is an error, it throws an
// async exception, but if you delete by id it just does a normal
// error callback. We fall back to deleting by name if no id is
// supplied.
LockManager.runWithLock(
containerName,
releaseLock =>
DockerRunner._destroyContainer(
containerId || containerName,
shouldForce,
releaseLock
),
callback
)
},
_destroyContainer(containerId, shouldForce, callback) {
logger.debug({ containerId }, 'destroying docker container')
const container = dockerode.getContainer(containerId)
container.remove({ force: shouldForce === true, v: true }, error => {
if (error != null && error.statusCode === 404) {
logger.warn(
{ err: error, containerId },
'container not found, continuing'
)
error = null
}
if (error != null) {
logger.error({ err: error, containerId }, 'error destroying container')
} else {
logger.debug({ containerId }, 'destroyed container')
}
callback(error)
})
},
// handle expiry of docker containers
MAX_CONTAINER_AGE: Settings.clsi.docker.maxContainerAge || ONE_HOUR_IN_MS,
examineOldContainer(container, callback) {
const name = container.Name || (container.Names && container.Names[0])
const created = container.Created * 1000 // creation time is returned in seconds
const now = Date.now()
const age = now - created
const maxAge = DockerRunner.MAX_CONTAINER_AGE
const ttl = maxAge - age
logger.debug(
{ containerName: name, created, now, age, maxAge, ttl },
'checking whether to destroy container'
)
return { name, id: container.Id, ttl }
},
destroyOldContainers(callback) {
dockerode.listContainers({ all: true }, (error, containers) => {
if (error != null) {
return callback(error)
}
const jobs = []
for (const container of containers) {
const { name, id, ttl } = DockerRunner.examineOldContainer(container)
if (name.slice(0, 9) === '/project-' && ttl <= 0) {
// strip the / prefix
// the LockManager uses the plain container name
const plainName = name.slice(1)
jobs.push(cb =>
DockerRunner.destroyContainer(plainName, id, false, () => cb())
)
}
}
// Ignore errors because some containers get stuck but
// will be destroyed next time
async.series(jobs, callback)
})
},
startContainerMonitor() {
logger.debug(
{ maxAge: DockerRunner.MAX_CONTAINER_AGE },
'starting container expiry'
)
// guarantee only one monitor is running
DockerRunner.stopContainerMonitor()
// randomise the start time
const randomDelay = Math.floor(Math.random() * 5 * 60 * 1000)
containerMonitorTimeout = setTimeout(() => {
containerMonitorInterval = setInterval(
() =>
DockerRunner.destroyOldContainers(err => {
if (err) {
logger.error({ err }, 'failed to destroy old containers')
}
}),
ONE_HOUR_IN_MS
)
}, randomDelay)
},
stopContainerMonitor() {
if (containerMonitorTimeout) {
clearTimeout(containerMonitorTimeout)
containerMonitorTimeout = undefined
}
if (containerMonitorInterval) {
clearInterval(containerMonitorInterval)
containerMonitorInterval = undefined
}
},
canRunSyncTeXInOutputDir() {
return Boolean(Settings.path.sandboxedCompilesHostDirOutput)
},
}
DockerRunner.startContainerMonitor()
module.exports = DockerRunner
module.exports.promises = {
run: promisify(DockerRunner.run),
kill: promisify(DockerRunner.kill),
}

View File

@@ -0,0 +1,24 @@
const fsPromises = require('node:fs/promises')
const { callbackify } = require('node:util')
const logger = require('@overleaf/logger')
async function injectDraftMode(filename) {
const content = await fsPromises.readFile(filename, { encoding: 'utf8' })
const modifiedContent =
'\\PassOptionsToPackage{draft}{graphicx}\\PassOptionsToPackage{draft}{graphics}' +
content
logger.debug(
{
content: content.slice(0, 1024), // \documentclass is normally very near the top
modifiedContent: modifiedContent.slice(0, 1024),
filename,
},
'injected draft class'
)
await fsPromises.writeFile(filename, modifiedContent, { encoding: 'utf8' })
}
module.exports = {
injectDraftMode: callbackify(injectDraftMode),
promises: { injectDraftMode },
}

View File

@@ -0,0 +1,49 @@
/* eslint-disable
no-proto,
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
const OError = require('@overleaf/o-error')
let Errors
function NotFoundError(message) {
const error = new Error(message)
error.name = 'NotFoundError'
error.__proto__ = NotFoundError.prototype
return error
}
NotFoundError.prototype.__proto__ = Error.prototype
function FilesOutOfSyncError(message) {
const error = new Error(message)
error.name = 'FilesOutOfSyncError'
error.__proto__ = FilesOutOfSyncError.prototype
return error
}
FilesOutOfSyncError.prototype.__proto__ = Error.prototype
function AlreadyCompilingError(message) {
const error = new Error(message)
error.name = 'AlreadyCompilingError'
error.__proto__ = AlreadyCompilingError.prototype
return error
}
AlreadyCompilingError.prototype.__proto__ = Error.prototype
class QueueLimitReachedError extends OError {}
class TimedOutError extends OError {}
class NoXrefTableError extends OError {}
class TooManyCompileRequestsError extends OError {}
class InvalidParameter extends OError {}
module.exports = Errors = {
QueueLimitReachedError,
TimedOutError,
NotFoundError,
FilesOutOfSyncError,
AlreadyCompilingError,
NoXrefTableError,
TooManyCompileRequestsError,
InvalidParameter,
}

View File

@@ -0,0 +1,203 @@
const Path = require('node:path')
const { promisify } = require('node:util')
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
const CommandRunner = require('./CommandRunner')
const fs = require('node:fs')
const ProcessTable = {} // table of currently running jobs (pids or docker container names)
const TIME_V_METRICS = Object.entries({
'cpu-percent': /Percent of CPU this job got: (\d+)/m,
'cpu-time': /User time.*: (\d+.\d+)/m,
'sys-time': /System time.*: (\d+.\d+)/m,
})
const COMPILER_FLAGS = {
latex: '-pdfdvi',
lualatex: '-lualatex',
pdflatex: '-pdf',
xelatex: '-xelatex',
}
function runLatex(projectId, options, callback) {
const {
directory,
mainFile,
image,
environment,
flags,
compileGroup,
stopOnFirstError,
stats,
timings,
} = options
const compiler = options.compiler || 'pdflatex'
const timeout = options.timeout || 60000 // milliseconds
logger.debug(
{
directory,
compiler,
timeout,
mainFile,
environment,
flags,
compileGroup,
stopOnFirstError,
},
'starting compile'
)
let command
try {
command = _buildLatexCommand(mainFile, {
compiler,
stopOnFirstError,
flags,
})
} catch (err) {
return callback(err)
}
const id = `${projectId}` // record running project under this id
ProcessTable[id] = CommandRunner.run(
projectId,
command,
directory,
image,
timeout,
environment,
compileGroup,
function (error, output) {
delete ProcessTable[id]
if (error) {
return callback(error)
}
const runs =
output?.stderr?.match(/^Run number \d+ of .*latex/gm)?.length || 0
const failed = output?.stdout?.match(/^Latexmk: Errors/m) != null ? 1 : 0
// counters from latexmk output
stats['latexmk-errors'] = failed
stats['latex-runs'] = runs
stats['latex-runs-with-errors'] = failed ? runs : 0
stats[`latex-runs-${runs}`] = 1
stats[`latex-runs-with-errors-${runs}`] = failed ? 1 : 0
// timing information from /usr/bin/time
const stderr = (output && output.stderr) || ''
if (stderr.includes('Command being timed:')) {
// Add metrics for runs with `$ time -v ...`
for (const [timing, matcher] of TIME_V_METRICS) {
const match = stderr.match(matcher)
if (match) {
timings[timing] = parseFloat(match[1])
}
}
}
// record output files
_writeLogOutput(projectId, directory, output, () => {
callback(error, output)
})
}
)
}
function _writeLogOutput(projectId, directory, output, callback) {
if (!output) {
return callback()
}
// internal method for writing non-empty log files
function _writeFile(file, content, cb) {
if (content && content.length > 0) {
fs.unlink(file, () => {
fs.writeFile(file, content, { flag: 'wx' }, err => {
if (err) {
// don't fail on error
logger.error({ err, projectId, file }, 'error writing log file')
}
cb()
})
})
} else {
cb()
}
}
// write stdout and stderr, ignoring errors
_writeFile(Path.join(directory, 'output.stdout'), output.stdout, () => {
_writeFile(Path.join(directory, 'output.stderr'), output.stderr, () => {
callback()
})
})
}
function killLatex(projectId, callback) {
const id = `${projectId}`
logger.debug({ id }, 'killing running compile')
if (ProcessTable[id] == null) {
logger.warn({ id }, 'no such project to kill')
callback(null)
} else {
CommandRunner.kill(ProcessTable[id], callback)
}
}
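// Builds the latexmk invocation. For example, a pdflatex compile without
// stop-on-first-error ends up roughly as:
//   latexmk -cd -jobname=output -auxdir=$COMPILE_DIR -outdir=$COMPILE_DIR \
//     -synctex=1 -interaction=batchmode -f -pdf $COMPILE_DIR/main.tex
// ($COMPILE_DIR is substituted by the CommandRunner; main.tex is an example file name)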
function _buildLatexCommand(mainFile, opts = {}) {
const command = []
if (Settings.clsi?.strace) {
command.push('strace', '-o', 'strace', '-ff')
}
if (Settings.clsi?.latexmkCommandPrefix) {
command.push(...Settings.clsi.latexmkCommandPrefix)
}
// Basic command and flags
command.push(
'latexmk',
'-cd',
'-jobname=output',
'-auxdir=$COMPILE_DIR',
'-outdir=$COMPILE_DIR',
'-synctex=1',
'-interaction=batchmode'
)
// Stop on first error option
if (opts.stopOnFirstError) {
command.push('-halt-on-error')
} else {
// Run all passes despite errors
command.push('-f')
}
// Extra flags
if (opts.flags) {
command.push(...opts.flags)
}
// TeX Engine selection
const compilerFlag = COMPILER_FLAGS[opts.compiler]
if (compilerFlag) {
command.push(compilerFlag)
} else {
throw new Error(`unknown compiler: ${opts.compiler}`)
}
// We want to run latexmk on the tex file which we will automatically
// generate from the Rtex/Rmd/md file.
mainFile = mainFile.replace(/\.(Rtex|md|Rmd|Rnw)$/, '.tex')
command.push(Path.join('$COMPILE_DIR', mainFile))
return command
}
module.exports = {
runLatex,
killLatex,
promises: {
runLatex: promisify(runLatex),
killLatex: promisify(killLatex),
},
}

View File

@@ -0,0 +1,111 @@
/* eslint-disable
no-return-assign,
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let CommandRunner
const { spawn } = require('node:child_process')
const { promisify } = require('node:util')
const _ = require('lodash')
const logger = require('@overleaf/logger')
logger.debug('using standard command runner')
module.exports = CommandRunner = {
run(
projectId,
command,
directory,
image,
timeout,
environment,
compileGroup,
callback
) {
let key, value
callback = _.once(callback)
command = Array.from(command).map(arg =>
arg.toString().replace('$COMPILE_DIR', directory)
)
logger.debug({ projectId, command, directory }, 'running command')
logger.warn('timeouts and sandboxing are not enabled with CommandRunner')
// merge environment settings
const env = {}
for (key in process.env) {
value = process.env[key]
env[key] = value
}
for (key in environment) {
value = environment[key]
env[key] = value
}
// run command as detached process so it has its own process group (which can be killed if needed)
const proc = spawn(command[0], command.slice(1), {
cwd: directory,
env,
stdio: ['pipe', 'pipe', 'ignore'],
detached: true, // own process group, so kill() below can signal -pid
})

let stdout = ''
proc.stdout.setEncoding('utf8').on('data', data => (stdout += data))
proc.on('error', function (err) {
logger.err(
{ err, projectId, command, directory },
'error running command'
)
return callback(err)
})
proc.on('close', function (code, signal) {
let err
logger.debug({ code, signal, projectId }, 'command exited')
if (signal === 'SIGTERM') {
// signal from kill method below
err = new Error('terminated')
err.terminated = true
return callback(err)
} else if (code === 1) {
// exit status from chktex
err = new Error('exited')
err.code = code
return callback(err)
} else {
return callback(null, { stdout })
}
})
return proc.pid
}, // return process id to allow job to be killed if necessary
kill(pid, callback) {
if (callback == null) {
callback = function () {}
}
try {
process.kill(-pid) // kill all processes in group
} catch (err) {
return callback(err)
}
return callback()
},
canRunSyncTeXInOutputDir() {
return true
},
}
module.exports.promises = {
run: promisify(CommandRunner.run),
kill: promisify(CommandRunner.kill),
}
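// --- Usage sketch (illustrative only; the project id and directory are assumptions) ---
// Runs `latexmk --version` and resolves with its stdout. Note that the
// synchronous pid return value (used by kill()) is not available through the
// promise wrapper; image, timeout and compileGroup are ignored by this runner.
// eslint-disable-next-line no-unused-vars
async function exampleVersionCheck() {
  const { stdout } = await module.exports.promises.run(
    'example-project', // projectId, used for logging only
    ['latexmk', '--version'], // command; '$COMPILE_DIR' would be substituted here
    '/tmp', // working directory
    null, // image (ignored by this runner)
    0, // timeout (not enforced by this runner)
    {}, // extra environment variables
    null // compileGroup (ignored by this runner)
  )
  return stdout
}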

View File

@@ -0,0 +1,66 @@
const logger = require('@overleaf/logger')
const Errors = require('./Errors')
const RequestParser = require('./RequestParser')
const Metrics = require('@overleaf/metrics')
const Settings = require('@overleaf/settings')
// The lock timeout should be higher than the maximum end-to-end compile time.
// Here, we use the maximum compile timeout plus 2 minutes.
const LOCK_TIMEOUT_MS = RequestParser.MAX_TIMEOUT * 1000 + 120000
const LOCKS = new Map()
function acquire(key) {
const currentLock = LOCKS.get(key)
if (currentLock != null) {
if (currentLock.isExpired()) {
logger.warn({ key }, 'Compile lock expired')
currentLock.release()
} else {
throw new Errors.AlreadyCompilingError('compile in progress')
}
}
checkConcurrencyLimit()
const lock = new Lock(key)
LOCKS.set(key, lock)
return lock
}
function checkConcurrencyLimit() {
Metrics.gauge('concurrent_compile_requests', LOCKS.size)
if (LOCKS.size <= Settings.compileConcurrencyLimit) {
return
}
Metrics.inc('exceeded-compiler-concurrency-limit')
throw new Errors.TooManyCompileRequestsError(
'too many concurrent compile requests'
)
}
class Lock {
constructor(key) {
this.key = key
this.expiresAt = Date.now() + LOCK_TIMEOUT_MS
}
isExpired() {
return Date.now() >= this.expiresAt
}
release() {
const lockWasActive = LOCKS.delete(this.key)
if (!lockWasActive) {
logger.error({ key: this.key }, 'Lock was released twice')
}
if (this.isExpired()) {
Metrics.inc('compile_lock_expired_before_release')
}
}
}
module.exports = { acquire }
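// --- Usage sketch (illustrative only) ---
// acquire() is synchronous and throws AlreadyCompilingError or
// TooManyCompileRequestsError; releasing in a finally block keeps the lock
// table tidy even when the compile itself fails.
// eslint-disable-next-line no-unused-vars
async function exampleWithCompileLock(key, compileFn) {
  const lock = acquire(key)
  try {
    return await compileFn()
  } finally {
    lock.release()
  }
}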

View File

@@ -0,0 +1,3 @@
// TODO: This file was created by bulk-decaffeinate.
// Sanity-check the conversion and remove this comment.
module.exports = require('@overleaf/metrics')

View File

@@ -0,0 +1,688 @@
let OutputCacheManager
const { callbackify, promisify } = require('node:util')
const async = require('async')
const fs = require('node:fs')
const Path = require('node:path')
const logger = require('@overleaf/logger')
const _ = require('lodash')
const Settings = require('@overleaf/settings')
const crypto = require('node:crypto')
const Metrics = require('./Metrics')
const OutputFileOptimiser = require('./OutputFileOptimiser')
const ContentCacheManager = require('./ContentCacheManager')
const {
QueueLimitReachedError,
TimedOutError,
NoXrefTableError,
} = require('./Errors')
const OLDEST_BUILD_DIR = new Map()
const PENDING_PROJECT_ACTIONS = new Map()
function init() {
doInit().catch(err => {
logger.fatal({ err }, 'low level error setting up cleanup of output dir')
// consider shutting down?
})
}
async function doInit() {
await fillCache()
const oldestTimestamp = await runBulkCleanup()
scheduleBulkCleanup(oldestTimestamp)
}
function scheduleBulkCleanup(oldestTimestamp) {
const delay =
Math.max(OutputCacheManager.CACHE_AGE + oldestTimestamp - Date.now(), 0) +
60 * 1000
setTimeout(async function () {
const oldestTimestamp = await runBulkCleanup()
scheduleBulkCleanup(oldestTimestamp)
}, delay)
}
async function fillCache() {
const handle = await fs.promises.opendir(Settings.path.outputDir)
try {
for await (const { name: projectIdAndUserId } of handle) {
OLDEST_BUILD_DIR.set(
Path.join(Settings.path.outputDir, projectIdAndUserId),
// Queue them for cleanup in the next hour.
Date.now() - Math.random() * OutputCacheManager.CACHE_AGE
)
}
} finally {
try {
await handle.close()
} catch (e) {}
}
}
async function runBulkCleanup() {
const cleanupThreshold = Date.now() - OutputCacheManager.CACHE_AGE
let oldestTimestamp = Date.now()
for (const [dir, timeStamp] of OLDEST_BUILD_DIR.entries()) {
if (timeStamp < cleanupThreshold) {
await cleanupDirectory(dir, { limit: OutputCacheManager.CACHE_LIMIT })
} else if (timeStamp < oldestTimestamp) {
oldestTimestamp = timeStamp
}
}
return oldestTimestamp
}
async function cleanupDirectory(dir, options) {
return await queueDirOperation(dir, async () => {
try {
await OutputCacheManager.promises.expireOutputFiles(dir, options)
} catch (err) {
logger.err({ dir, err }, 'cleanup of output directory failed')
}
})
}
/**
* @template T
*
* @param {string} dir
* @param {() => Promise<T>} fn
* @return {Promise<T>}
*/
async function queueDirOperation(dir, fn) {
const pending = PENDING_PROJECT_ACTIONS.get(dir) || Promise.resolve()
const p = pending.then(fn, fn).finally(() => {
if (PENDING_PROJECT_ACTIONS.get(dir) === p) {
PENDING_PROJECT_ACTIONS.delete(dir)
}
})
PENDING_PROJECT_ACTIONS.set(dir, p)
return p
}
module.exports = OutputCacheManager = {
CONTENT_SUBDIR: 'content',
CACHE_SUBDIR: 'generated-files',
ARCHIVE_SUBDIR: 'archived-logs',
// build id is HEXDATE-HEXRANDOM from Date.now() and RandomBytes
BUILD_REGEX: /^[0-9a-f]+-[0-9a-f]+$/,
CONTENT_REGEX: /^[0-9a-f]+-[0-9a-f]+$/,
CACHE_LIMIT: 2, // maximum number of cache directories
CACHE_AGE: 90 * 60 * 1000, // up to 90 minutes old
init,
queueDirOperation: callbackify(queueDirOperation),
path(buildId, file) {
// used by the static server: given a build id, return 'generated-files/<buildId>/<file>'
if (buildId.match(OutputCacheManager.BUILD_REGEX)) {
return Path.join(OutputCacheManager.CACHE_SUBDIR, buildId, file)
} else {
// for invalid build id, return top level
return file
}
},
generateBuildId(callback) {
// generate a secure build id from Date.now() and 8 random bytes in hex
crypto.randomBytes(8, function (err, buf) {
if (err) {
return callback(err)
}
const random = buf.toString('hex')
const date = Date.now().toString(16)
callback(err, `${date}-${random}`)
})
},
saveOutputFiles(
{ request, stats, timings },
outputFiles,
compileDir,
outputDir,
callback
) {
const getBuildId = cb => {
if (request.buildId) return cb(null, request.buildId)
OutputCacheManager.generateBuildId(cb)
}
getBuildId(function (err, buildId) {
if (err) {
return callback(err)
}
if (!OLDEST_BUILD_DIR.has(outputDir)) {
// Register for cleanup
OLDEST_BUILD_DIR.set(outputDir, Date.now())
}
OutputCacheManager.queueDirOperation(
outputDir,
() =>
OutputCacheManager.promises.saveOutputFilesInBuildDir(
outputFiles,
compileDir,
outputDir,
buildId
),
function (err, result) {
if (err) {
return callback(err)
}
OutputCacheManager.collectOutputPdfSize(
result,
outputDir,
stats,
(err, outputFiles) => {
if (err) return callback(err, { outputFiles, buildId })
const enablePdfCaching = request.enablePdfCaching
const enablePdfCachingDark =
Settings.enablePdfCachingDark && !request.enablePdfCaching
if (
!Settings.enablePdfCaching ||
(!enablePdfCaching && !enablePdfCachingDark)
) {
return callback(null, { outputFiles, buildId })
}
OutputCacheManager.saveStreamsInContentDir(
{ request, stats, timings, enablePdfCachingDark },
outputFiles,
compileDir,
outputDir,
(err, status) => {
Metrics.inc('pdf-caching-status', 1, {
status,
...request.metricsOpts,
})
if (err) {
logger.warn(
{ err, outputDir, stats, timings },
'pdf caching failed'
)
return callback(null, { outputFiles, buildId })
}
callback(err, { outputFiles, buildId })
}
)
}
)
}
)
})
},
saveOutputFilesInBuildDir(
outputFiles,
compileDir,
outputDir,
buildId,
callback
) {
// make a compileDir/CACHE_SUBDIR/build_id directory and
// copy all the output files into it
// Put the files into a new cache subdirectory
const cacheDir = Path.join(
outputDir,
OutputCacheManager.CACHE_SUBDIR,
buildId
)
// Is it a per-user compile? check if compile directory is PROJECTID-USERID
const perUser = Path.basename(compileDir).match(
/^[0-9a-f]{24}-[0-9a-f]{24}$/
)
// Archive logs in background
if (Settings.clsi?.archive_logs || Settings.clsi?.strace) {
OutputCacheManager.archiveLogs(
outputFiles,
compileDir,
outputDir,
buildId,
function (err) {
if (err) {
return logger.warn({ err }, 'error archiving log files')
}
}
)
}
// make the new cache directory
fs.mkdir(cacheDir, { recursive: true }, function (err) {
if (err) {
logger.error(
{ err, directory: cacheDir },
'error creating cache directory'
)
callback(err)
} else {
// copy all the output files into the new cache directory
const results = []
const dirCache = new Set()
dirCache.add(cacheDir)
async.mapSeries(
outputFiles,
function (file, cb) {
// don't send dot files as output, express doesn't serve them
if (OutputCacheManager._fileIsHidden(file.path)) {
logger.debug(
{ compileDir, path: file.path },
'ignoring dotfile in output'
)
return cb()
}
// copy other files into cache directory if valid
const src = Path.join(compileDir, file.path)
const dst = Path.join(cacheDir, file.path)
OutputCacheManager._checkIfShouldCopy(
src,
function (err, shouldCopy) {
if (err) {
return cb(err)
}
if (!shouldCopy) {
return cb()
}
OutputCacheManager._copyFile(src, dst, dirCache, err => {
if (err) {
return cb(err)
}
file.build = buildId
results.push(file)
cb()
})
}
)
},
function (err) {
if (err) {
callback(err)
// clean up the directory we just created
fs.rm(cacheDir, { force: true, recursive: true }, function (err) {
if (err) {
return logger.error(
{ err, dir: cacheDir },
'error removing cache dir after failure'
)
}
})
} else {
// pass back the list of new files in the cache
callback(null, results)
// let file expiry run in the background, expire all previous files if per-user
cleanupDirectory(outputDir, {
keep: buildId,
limit: perUser ? 1 : null,
}).catch(() => {})
}
}
)
}
})
},
collectOutputPdfSize(outputFiles, outputDir, stats, callback) {
const outputFile = outputFiles.find(x => x.path === 'output.pdf')
if (!outputFile) return callback(null, outputFiles)
const outputFilePath = Path.join(
outputDir,
OutputCacheManager.path(outputFile.build, outputFile.path)
)
fs.stat(outputFilePath, (err, stat) => {
if (err) return callback(err, outputFiles)
outputFile.size = stat.size
stats['pdf-size'] = outputFile.size
callback(null, outputFiles)
})
},
saveStreamsInContentDir(
{ request, stats, timings, enablePdfCachingDark },
outputFiles,
compileDir,
outputDir,
callback
) {
const cacheRoot = Path.join(outputDir, OutputCacheManager.CONTENT_SUBDIR)
// check if content dir exists
OutputCacheManager.ensureContentDir(cacheRoot, function (err, contentDir) {
if (err) return callback(err, 'content-dir-unavailable')
const outputFile = outputFiles.find(x => x.path === 'output.pdf')
if (outputFile) {
// possibly we should copy the file from the build dir here
const outputFilePath = Path.join(
outputDir,
OutputCacheManager.path(outputFile.build, outputFile.path)
)
const pdfSize = outputFile.size
const timer = new Metrics.Timer(
'compute-pdf-ranges',
1,
request.metricsOpts
)
ContentCacheManager.update(
{
contentDir,
filePath: outputFilePath,
pdfSize,
pdfCachingMinChunkSize: request.pdfCachingMinChunkSize,
compileTime: timings.compile,
},
function (err, result) {
if (err && err instanceof NoXrefTableError) {
return callback(null, err.message)
}
if (err && err instanceof QueueLimitReachedError) {
logger.warn({ err, outputDir }, 'pdf caching queue limit reached')
stats['pdf-caching-queue-limit-reached'] = 1
return callback(null, 'queue-limit')
}
if (err && err instanceof TimedOutError) {
logger.warn(
{ err, outputDir, stats, timings },
'pdf caching timed out'
)
stats['pdf-caching-timed-out'] = 1
return callback(null, 'timed-out')
}
if (err) return callback(err, 'failed')
const {
contentRanges,
newContentRanges,
reclaimedSpace,
overheadDeleteStaleHashes,
timedOutErr,
startXRefTable,
} = result
let status = 'success'
if (timedOutErr) {
// Soft failure: let the frontend use partial set of ranges.
logger.warn(
{
err: timedOutErr,
overheadDeleteStaleHashes,
outputDir,
stats,
timings,
},
'pdf caching timed out - soft failure'
)
stats['pdf-caching-timed-out'] = 1
status = 'timed-out-soft-failure'
}
if (enablePdfCachingDark) {
// In dark mode we are doing the computation only and do not emit
// any ranges to the frontend.
} else {
outputFile.contentId = Path.basename(contentDir)
outputFile.ranges = contentRanges
outputFile.startXRefTable = startXRefTable
}
timings['compute-pdf-caching'] = timer.done()
stats['pdf-caching-n-ranges'] = contentRanges.length
stats['pdf-caching-total-ranges-size'] = contentRanges.reduce(
(sum, next) => sum + (next.end - next.start),
0
)
stats['pdf-caching-n-new-ranges'] = newContentRanges.length
stats['pdf-caching-new-ranges-size'] = newContentRanges.reduce(
(sum, next) => sum + (next.end - next.start),
0
)
stats['pdf-caching-reclaimed-space'] = reclaimedSpace
timings['pdf-caching-overhead-delete-stale-hashes'] =
overheadDeleteStaleHashes
callback(null, status)
}
)
} else {
callback(null, 'missing-pdf')
}
})
},
ensureContentDir(contentRoot, callback) {
fs.mkdir(contentRoot, { recursive: true }, function (err) {
if (err) {
return callback(err)
}
fs.readdir(contentRoot, function (err, results) {
if (err) return callback(err)
const dirs = results.sort()
const contentId = dirs.find(dir =>
OutputCacheManager.BUILD_REGEX.test(dir)
)
if (contentId) {
callback(null, Path.join(contentRoot, contentId))
} else {
// make a content directory
OutputCacheManager.generateBuildId(function (err, contentId) {
if (err) {
return callback(err)
}
const contentDir = Path.join(contentRoot, contentId)
fs.mkdir(contentDir, { recursive: true }, function (err) {
if (err) {
return callback(err)
}
callback(null, contentDir)
})
})
}
})
})
},
archiveLogs(outputFiles, compileDir, outputDir, buildId, callback) {
const archiveDir = Path.join(
outputDir,
OutputCacheManager.ARCHIVE_SUBDIR,
buildId
)
logger.debug({ dir: archiveDir }, 'archiving log files for project')
fs.mkdir(archiveDir, { recursive: true }, function (err) {
if (err) {
return callback(err)
}
const dirCache = new Set()
dirCache.add(archiveDir)
async.mapSeries(
outputFiles,
function (file, cb) {
const src = Path.join(compileDir, file.path)
const dst = Path.join(archiveDir, file.path)
OutputCacheManager._checkIfShouldArchive(
src,
function (err, shouldArchive) {
if (err) {
return cb(err)
}
if (!shouldArchive) {
return cb()
}
OutputCacheManager._copyFile(src, dst, dirCache, cb)
}
)
},
callback
)
})
},
expireOutputFiles(outputDir, options, callback) {
// look in compileDir for build dirs and delete if > N or age of mod time > T
const cleanupAll = cb => {
fs.rm(outputDir, { force: true, recursive: true }, err => {
if (err) {
return cb(err)
}
// Drop reference after successful cleanup of the output dir.
OLDEST_BUILD_DIR.delete(outputDir)
cb(null)
})
}
const cacheRoot = Path.join(outputDir, OutputCacheManager.CACHE_SUBDIR)
fs.readdir(cacheRoot, function (err, results) {
if (err) {
if (err.code === 'ENOENT') {
// cache directory is empty
return cleanupAll(callback)
}
logger.error({ err, dir: cacheRoot }, 'error clearing cache')
return callback(err)
}
const dirs = results.sort().reverse()
const currentTime = Date.now()
let oldestDirTimeToKeep = 0
const isExpired = function (dir, index) {
if (options?.keep === dir) {
// This is the directory we just created for the compile request.
oldestDirTimeToKeep = currentTime
return false
}
// remove any directories over the requested (non-null) limit
if (options?.limit != null && index > options.limit) {
return true
}
// remove any directories over the hard limit
if (index > OutputCacheManager.CACHE_LIMIT) {
return true
}
// we can get the build time from the first part of the directory name DDDD-RRRR
// DDDD is date and RRRR is random bytes
const dirTime = parseInt(dir.split('-')[0], 16)
const age = currentTime - dirTime
const expired = age > OutputCacheManager.CACHE_AGE
if (expired) {
return true
}
oldestDirTimeToKeep = dirTime
return false
}
const toRemove = _.filter(dirs, isExpired)
if (toRemove.length === dirs.length) {
// No builds left after cleanup.
return cleanupAll(callback)
}
const removeDir = (dir, cb) =>
fs.rm(
Path.join(cacheRoot, dir),
{ force: true, recursive: true },
function (err, result) {
logger.debug({ cache: cacheRoot, dir }, 'removed expired cache dir')
if (err) {
logger.error({ err, dir }, 'cache remove error')
}
cb(err, result)
}
)
async.eachSeries(
toRemove,
(dir, cb) => removeDir(dir, cb),
err => {
if (err) {
// On error: keep the timestamp in the past.
// The next iteration of the cleanup loop will retry the deletion.
return callback(err)
}
// On success: push the timestamp into the future.
OLDEST_BUILD_DIR.set(outputDir, oldestDirTimeToKeep)
callback(null)
}
)
})
},
_fileIsHidden(path) {
return path?.match(/^\.|\/\./) != null
},
_ensureParentExists(dst, dirCache, callback) {
let parent = Path.dirname(dst)
if (dirCache.has(parent)) {
callback()
} else {
fs.mkdir(parent, { recursive: true }, err => {
if (err) return callback(err)
while (!dirCache.has(parent)) {
dirCache.add(parent)
parent = Path.dirname(parent)
}
callback()
})
}
},
_copyFile(src, dst, dirCache, callback) {
OutputCacheManager._ensureParentExists(dst, dirCache, err => {
if (err) {
logger.warn(
{ err, dst },
'creating parent directory in output cache failed'
)
return callback(err, false)
}
// copy output file into the cache
fs.copyFile(src, dst, function (err) {
if (err?.code === 'ENOENT') {
logger.warn(
{ err, file: src },
'file has disappeared when copying to build cache'
)
callback(err, false)
} else if (err) {
logger.error({ err, src, dst }, 'copy error for file in cache')
callback(err)
} else {
if (Settings.clsi?.optimiseInDocker) {
// don't run any optimisations on the pdf when they are done
// in the docker container
callback()
} else {
// call the optimiser for the file too
OutputFileOptimiser.optimiseFile(src, dst, callback)
}
}
})
})
},
_checkIfShouldCopy(src, callback) {
callback(null, !Path.basename(src).match(/^strace/))
},
_checkIfShouldArchive(src, callback) {
if (Path.basename(src).match(/^strace/)) {
return callback(null, true)
}
const basename = Path.basename(src)
if (
Settings.clsi?.archive_logs &&
['output.log', 'output.blg'].includes(basename)
) {
return callback(null, true)
}
callback(null, false)
},
}
OutputCacheManager.promises = {
expireOutputFiles: promisify(OutputCacheManager.expireOutputFiles),
saveOutputFiles: promisify(OutputCacheManager.saveOutputFiles),
saveOutputFilesInBuildDir: promisify(
OutputCacheManager.saveOutputFilesInBuildDir
),
queueDirOperation,
}
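// --- Usage sketch (illustrative only; the example build id is made up) ---
// Shows how a freshly generated build id maps onto the path served by the
// static server: 'generated-files/<buildId>/<file>' inside the output directory.
// eslint-disable-next-line no-unused-vars
function exampleBuildPath(callback) {
  OutputCacheManager.generateBuildId((err, buildId) => {
    if (err) return callback(err)
    // e.g. 'generated-files/18f3a2b4c5d6-0a1b2c3d4e5f6a7b/output.pdf'
    callback(null, OutputCacheManager.path(buildId, 'output.pdf'))
  })
}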

View File

@@ -0,0 +1,23 @@
const OutputFileArchiveManager = require('./OutputFileArchiveManager')
const { expressify } = require('@overleaf/promise-utils')
const { pipeline } = require('node:stream/promises')
async function createOutputZip(req, res) {
const {
project_id: projectId,
user_id: userId,
build_id: buildId,
} = req.params
const archive = await OutputFileArchiveManager.archiveFilesForBuild(
projectId,
userId,
buildId
)
res.attachment('output.zip')
res.setHeader('X-Content-Type-Options', 'nosniff')
await pipeline(archive, res)
}
module.exports = { createOutputZip: expressify(createOutputZip) }
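// --- Usage sketch (illustrative only; the route shape is an assumption) ---
// The handler is wired into the Express router elsewhere in the service,
// along the lines of:
//
//   app.get(
//     '/project/:project_id/user/:user_id/build/:build_id/output/output.zip',
//     OutputController.createOutputZip
//   )
//
// expressify() converts the async handler so rejections reach next().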

View File

@@ -0,0 +1,113 @@
const archiver = require('archiver')
const OutputCacheManager = require('./OutputCacheManager')
const OutputFileFinder = require('./OutputFileFinder')
const Settings = require('@overleaf/settings')
const { open } = require('node:fs/promises')
const { NotFoundError } = require('./Errors')
const logger = require('@overleaf/logger')
// NOTE: Updating this list requires a corresponding change in
// * services/web/frontend/js/features/pdf-preview/util/file-list.ts
const ignoreFiles = ['output.fls', 'output.fdb_latexmk']
function getContentDir(projectId, userId) {
let subDir
if (userId != null) {
subDir = `${projectId}-${userId}`
} else {
subDir = projectId
}
return `${Settings.path.outputDir}/${subDir}/`
}
module.exports = {
async archiveFilesForBuild(projectId, userId, build) {
logger.debug({ projectId, userId, build }, 'Will create zip file')
const contentDir = getContentDir(projectId, userId)
const outputFiles = await this._getAllOutputFiles(
contentDir,
projectId,
userId,
build
)
const archive = archiver('zip')
archive.on('error', err => {
logger.warn(
{ err, projectId, userId, build },
'error emitted when creating output files archive'
)
})
archive.on('warning', err => {
logger.warn(
{ err, projectId, userId, build },
'warning emitted when creating output files archive'
)
})
const missingFiles = []
for (const { path } of outputFiles) {
let fileHandle
try {
fileHandle = await open(
`${contentDir}${OutputCacheManager.path(build, path)}`
)
} catch (error) {
logger.warn(
{ path, error, projectId, userId, build },
'error opening file to add to output files archive'
)
missingFiles.push(path)
continue
}
const fileStream = fileHandle.createReadStream()
archive.append(fileStream, { name: path })
}
if (missingFiles.length > 0) {
archive.append(missingFiles.join('\n'), {
name: 'missing_files.txt',
})
}
archive.finalize().catch(error => {
logger.error(
{ error, projectId, userId, build },
'error finalizing output files archive'
)
})
return archive
},
async _getAllOutputFiles(contentDir, projectId, userId, build) {
try {
const { outputFiles } = await OutputFileFinder.promises.findOutputFiles(
[],
`${contentDir}${OutputCacheManager.path(build, '.')}`
)
return outputFiles.filter(
// Ignore the pdf, the clsi-cache tar-ball, and the files ignored by the frontend.
({ path }) =>
path !== 'output.pdf' &&
path !== 'output.tar.gz' &&
!ignoreFiles.includes(path)
)
} catch (error) {
if (
error.code === 'ENOENT' ||
error.code === 'ENOTDIR' ||
error.code === 'EACCES'
) {
throw new NotFoundError('Output files not found')
}
throw error
}
},
}
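// --- Usage sketch (illustrative only; the output path is an assumption) ---
// Streams the zip for one build to a file on disk instead of an HTTP response.
// eslint-disable-next-line no-unused-vars
async function exampleDumpBuildZip(projectId, userId, buildId, outPath) {
  const { createWriteStream } = require('node:fs')
  const { pipeline } = require('node:stream/promises')
  const archive = await module.exports.archiveFilesForBuild(
    projectId,
    userId,
    buildId
  )
  await pipeline(archive, createWriteStream(outPath))
}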

View File

@@ -0,0 +1,53 @@
const Path = require('node:path')
const fs = require('node:fs')
const { callbackifyMultiResult } = require('@overleaf/promise-utils')
async function walkFolder(compileDir, d, files, allEntries) {
const dirents = await fs.promises.readdir(Path.join(compileDir, d), {
withFileTypes: true,
})
for (const dirent of dirents) {
const p = Path.join(d, dirent.name)
if (dirent.isDirectory()) {
await walkFolder(compileDir, p, files, allEntries)
allEntries.push(p + '/')
} else if (dirent.isFile()) {
files.push(p)
allEntries.push(p)
} else {
allEntries.push(p)
}
}
}
async function findOutputFiles(resources, directory) {
const files = []
const allEntries = []
await walkFolder(directory, '', files, allEntries)
const incomingResources = new Set(resources.map(resource => resource.path))
const outputFiles = []
for (const path of files) {
if (incomingResources.has(path)) continue
if (path === '.project-sync-state') continue
outputFiles.push({
path,
type: Path.extname(path).replace(/^\./, '') || undefined,
})
}
return {
outputFiles,
allEntries,
}
}
module.exports = {
findOutputFiles: callbackifyMultiResult(findOutputFiles, [
'outputFiles',
'allEntries',
]),
promises: {
findOutputFiles,
},
}
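// --- Usage sketch (illustrative only) ---
// Lists everything a compile produced, excluding the resources that were sent
// in with the request and the sync-state marker file.
// eslint-disable-next-line no-unused-vars
async function exampleListOutputs(compileDir, resources) {
  const { outputFiles, allEntries } = await findOutputFiles(resources, compileDir)
  return {
    walked: allEntries.length,
    outputs: outputFiles.map(f => `${f.path} (${f.type || 'no extension'})`),
  }
}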

View File

@@ -0,0 +1,100 @@
/* eslint-disable
no-return-assign,
no-undef,
no-unused-vars,
n/no-deprecated-api,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let OutputFileOptimiser
const fs = require('node:fs')
const Path = require('node:path')
const { spawn } = require('node:child_process')
const logger = require('@overleaf/logger')
const Metrics = require('./Metrics')
const _ = require('lodash')
module.exports = OutputFileOptimiser = {
optimiseFile(src, dst, callback) {
// check output file (src) and see if we can optimise it, storing
// the result in the build directory (dst)
if (callback == null) {
callback = function () {}
}
if (src.match(/\/output\.pdf$/)) {
return OutputFileOptimiser.checkIfPDFIsOptimised(
src,
function (err, isOptimised) {
if (err != null || isOptimised) {
return callback(null)
}
return OutputFileOptimiser.optimisePDF(src, dst, callback)
}
)
} else {
return callback(null)
}
},
checkIfPDFIsOptimised(file, callback) {
const SIZE = 16 * 1024 // check the header of the pdf
const result = Buffer.alloc(SIZE) // fills with zeroes by default
return fs.open(file, 'r', function (err, fd) {
if (err != null) {
return callback(err)
}
return fs.read(fd, result, 0, SIZE, 0, (errRead, bytesRead, buffer) =>
fs.close(fd, function (errClose) {
if (errRead != null) {
return callback(errRead)
}
if (errClose != null) {
return callback(errClose)
}
const isOptimised =
buffer.toString('ascii').indexOf('/Linearized 1') >= 0
return callback(null, isOptimised)
})
)
})
},
optimisePDF(src, dst, callback) {
if (callback == null) {
callback = function () {}
}
const tmpOutput = dst + '.opt'
const args = ['--linearize', '--newline-before-endstream', src, tmpOutput]
logger.debug({ args }, 'running qpdf command')
const timer = new Metrics.Timer('qpdf')
const proc = spawn('qpdf', args, { stdio: 'ignore' })
callback = _.once(callback) // avoid double call back for error and close event
proc.on('error', function (err) {
logger.warn({ err, args }, 'qpdf failed')
return callback(null)
}) // ignore the error
return proc.on('close', function (code) {
timer.done()
if (code !== 0) {
logger.warn({ code, args }, 'qpdf returned error')
return callback(null) // ignore the error
}
return fs.rename(tmpOutput, dst, function (err) {
if (err != null) {
logger.warn(
{ err, tmpOutput, dst },
'failed to rename output of qpdf command'
)
}
return callback(null)
})
})
}, // ignore the error
}
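// --- Usage sketch (illustrative only; the paths are assumptions) ---
// Linearises a cached output.pdf with qpdf; non-pdf paths are passed through
// untouched, and qpdf failures are swallowed, so the callback error is null.
// eslint-disable-next-line no-unused-vars
function exampleOptimise(callback) {
  OutputFileOptimiser.optimiseFile(
    '/compiles/example/output.pdf', // src: the freshly compiled file
    '/output/example/generated-files/abc-def/output.pdf', // dst: the cached copy
    callback
  )
}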

View File

@@ -0,0 +1,247 @@
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let ProjectPersistenceManager
const UrlCache = require('./UrlCache')
const CompileManager = require('./CompileManager')
const async = require('async')
const logger = require('@overleaf/logger')
const oneDay = 24 * 60 * 60 * 1000
const Metrics = require('@overleaf/metrics')
const Settings = require('@overleaf/settings')
const { callbackify } = require('node:util')
const Path = require('node:path')
const fs = require('node:fs')
// projectId -> timestamp mapping.
const LAST_ACCESS = new Map()
async function collectDiskStats() {
const paths = [
Settings.path.compilesDir,
Settings.path.outputDir,
Settings.path.clsiCacheDir,
]
const diskStats = {}
for (const path of paths) {
try {
const { blocks, bavail, bsize } = await fs.promises.statfs(path)
const stats = {
// Warning: these values will be wrong by a factor in Docker-for-Mac.
// See https://github.com/docker/for-mac/issues/2136
total: blocks * bsize, // Total size of the file system in bytes
available: bavail * bsize, // Free space available to unprivileged users.
}
const diskAvailablePercent = (stats.available / stats.total) * 100
Metrics.gauge('disk_available_percent', diskAvailablePercent, 1, {
path,
})
const lowDisk = diskAvailablePercent < 10
diskStats[path] = { stats, lowDisk }
} catch (err) {
logger.err({ err, path }, 'error getting disk usage')
}
}
return diskStats
}
async function refreshExpiryTimeout() {
for (const [path, { stats, lowDisk }] of Object.entries(
await collectDiskStats()
)) {
const lowerExpiry = ProjectPersistenceManager.EXPIRY_TIMEOUT * 0.9
if (lowDisk && Settings.project_cache_length_ms / 2 < lowerExpiry) {
logger.warn(
{
path,
stats,
newExpiryTimeoutInDays: (lowerExpiry / oneDay).toFixed(2),
},
'disk running low on space, modifying EXPIRY_TIMEOUT'
)
ProjectPersistenceManager.EXPIRY_TIMEOUT = lowerExpiry
break
}
}
}
module.exports = ProjectPersistenceManager = {
EXPIRY_TIMEOUT: Settings.project_cache_length_ms || oneDay * 2.5,
promises: {
refreshExpiryTimeout,
},
refreshExpiryTimeout: callbackify(refreshExpiryTimeout),
init() {
fs.readdir(Settings.path.compilesDir, (err, dirs) => {
if (err) {
logger.warn({ err }, 'cannot get project listing')
dirs = []
}
async.eachLimit(
dirs,
10,
(projectAndUserId, cb) => {
const compileDir = Path.join(
Settings.path.compilesDir,
projectAndUserId
)
const projectId = projectAndUserId.slice(0, 24)
fs.stat(compileDir, (err, stats) => {
if (err) {
// Schedule for immediate cleanup
LAST_ACCESS.set(projectId, 0)
} else {
// Cleanup eventually.
LAST_ACCESS.set(projectId, stats.mtime.getTime())
}
cb()
})
},
() => {
setInterval(
() => {
ProjectPersistenceManager.refreshExpiryTimeout(() => {
ProjectPersistenceManager.clearExpiredProjects(err => {
if (err) {
logger.error({ err }, 'clearing expired projects failed')
}
})
})
},
10 * 60 * 1000
)
}
)
})
// Collect disk stats frequently to have them ready the next time /metrics is scraped (60s +- jitter).
setInterval(() => {
collectDiskStats().catch(err => {
logger.err({ err }, 'low level error collecting disk stats')
})
}, 50_000)
},
markProjectAsJustAccessed(projectId, callback) {
LAST_ACCESS.set(projectId, Date.now())
callback()
},
clearExpiredProjects(callback) {
if (callback == null) {
callback = function () {}
}
return ProjectPersistenceManager._findExpiredProjectIds(
function (error, projectIds) {
if (error != null) {
return callback(error)
}
logger.debug({ projectIds }, 'clearing expired projects')
const jobs = Array.from(projectIds || []).map(projectId =>
(
projectId => callback =>
ProjectPersistenceManager.clearProjectFromCache(
projectId,
{ reason: 'expired' },
function (err) {
if (err != null) {
logger.error({ err, projectId }, 'error clearing project')
}
return callback()
}
)
)(projectId)
)
return async.series(jobs, function (error) {
if (error != null) {
return callback(error)
}
return CompileManager.clearExpiredProjects(
ProjectPersistenceManager.EXPIRY_TIMEOUT,
error => callback(error)
)
})
}
)
}, // ignore any errors from deleting directories
clearProject(projectId, userId, callback) {
if (callback == null) {
callback = function () {}
}
logger.debug({ projectId, userId }, 'clearing project for user')
return CompileManager.clearProject(projectId, userId, function (error) {
if (error != null) {
return callback(error)
}
return ProjectPersistenceManager.clearProjectFromCache(
projectId,
{ reason: 'cleared' },
function (error) {
if (error != null) {
return callback(error)
}
return callback()
}
)
})
},
clearProjectFromCache(projectId, options, callback) {
if (callback == null) {
callback = function () {}
}
logger.debug({ projectId }, 'clearing project from cache')
return UrlCache.clearProject(projectId, options, function (error) {
if (error != null) {
logger.err({ error, projectId }, 'error clearing project from cache')
return callback(error)
}
return ProjectPersistenceManager._clearProjectFromDatabase(
projectId,
function (error) {
if (error != null) {
logger.err(
{ error, projectId },
'error clearing project from database'
)
}
return callback(error)
}
)
})
},
_clearProjectFromDatabase(projectId, callback) {
LAST_ACCESS.delete(projectId)
callback()
},
_findExpiredProjectIds(callback) {
const expiredFrom = Date.now() - ProjectPersistenceManager.EXPIRY_TIMEOUT
const expiredProjectsIds = []
for (const [projectId, lastAccess] of LAST_ACCESS.entries()) {
if (lastAccess < expiredFrom) {
expiredProjectsIds.push(projectId)
}
}
// ^ may be a fairly busy loop, continue detached.
setTimeout(() => callback(null, expiredProjectsIds), 0)
},
}
logger.debug(
{ EXPIRY_TIMEOUT: ProjectPersistenceManager.EXPIRY_TIMEOUT },
'project assets kept timeout'
)
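// --- Usage sketch (illustrative only; the project id is an assumption) ---
// Typical lifecycle: record an access whenever a project compiles, and let the
// interval started by init() run the expiry sweep. A manual sweep looks like:
// eslint-disable-next-line no-unused-vars
function exampleManualSweep(callback) {
  ProjectPersistenceManager.markProjectAsJustAccessed('example-project-id', () => {
    ProjectPersistenceManager.clearExpiredProjects(callback)
  })
}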

View File

@@ -0,0 +1,250 @@
const settings = require('@overleaf/settings')
const OutputCacheManager = require('./OutputCacheManager')
const VALID_COMPILERS = ['pdflatex', 'latex', 'xelatex', 'lualatex']
const MAX_TIMEOUT = 600
const EDITOR_ID_REGEX = /^[a-f0-9-]{36}$/ // UUID
function parse(body, callback) {
const response = {}
if (body.compile == null) {
return callback(
new Error('top level object should have a compile attribute')
)
}
const { compile } = body
if (!compile.options) {
compile.options = {}
}
try {
response.metricsOpts = {
path: _parseAttribute('metricsPath', compile.options.metricsPath, {
default: '',
type: 'string',
}),
method: _parseAttribute('metricsMethod', compile.options.metricsMethod, {
default: '',
type: 'string',
}),
// Will be populated later. Must always be populated for prom library.
compile: 'initial',
}
response.compiler = _parseAttribute('compiler', compile.options.compiler, {
validValues: VALID_COMPILERS,
default: 'pdflatex',
type: 'string',
})
response.compileFromClsiCache = _parseAttribute(
'compileFromClsiCache',
compile.options.compileFromClsiCache,
{ default: false, type: 'boolean' }
)
response.populateClsiCache = _parseAttribute(
'populateClsiCache',
compile.options.populateClsiCache,
{ default: false, type: 'boolean' }
)
response.enablePdfCaching = _parseAttribute(
'enablePdfCaching',
compile.options.enablePdfCaching,
{
default: false,
type: 'boolean',
}
)
response.pdfCachingMinChunkSize = _parseAttribute(
'pdfCachingMinChunkSize',
compile.options.pdfCachingMinChunkSize,
{
default: settings.pdfCachingMinChunkSize,
type: 'number',
}
)
response.timeout = _parseAttribute('timeout', compile.options.timeout, {
default: MAX_TIMEOUT,
type: 'number',
})
response.imageName = _parseAttribute(
'imageName',
compile.options.imageName,
{
type: 'string',
validValues:
settings.clsi &&
settings.clsi.docker &&
settings.clsi.docker.allowedImages,
}
)
response.draft = _parseAttribute('draft', compile.options.draft, {
default: false,
type: 'boolean',
})
response.stopOnFirstError = _parseAttribute(
'stopOnFirstError',
compile.options.stopOnFirstError,
{
default: false,
type: 'boolean',
}
)
response.check = _parseAttribute('check', compile.options.check, {
type: 'string',
})
response.flags = _parseAttribute('flags', compile.options.flags, {
default: [],
type: 'object',
})
if (settings.allowedCompileGroups) {
response.compileGroup = _parseAttribute(
'compileGroup',
compile.options.compileGroup,
{
validValues: settings.allowedCompileGroups,
default: '',
type: 'string',
}
)
}
// The syncType specifies whether the request contains all
// resources (full) or only those resources to be updated
// in-place (incremental).
response.syncType = _parseAttribute('syncType', compile.options.syncType, {
validValues: ['full', 'incremental'],
type: 'string',
})
// The syncState is an identifier passed in with the request
// which has the property that it changes when any resource is
// added, deleted, moved or renamed.
//
// on syncType full the syncState identifier is passed in and
// stored
//
// on syncType incremental the syncState identifier must match
// the stored value
response.syncState = _parseAttribute(
'syncState',
compile.options.syncState,
{ type: 'string' }
)
if (response.timeout > MAX_TIMEOUT) {
response.timeout = MAX_TIMEOUT
}
response.timeout = response.timeout * 1000 // milliseconds
response.resources = (compile.resources || []).map(resource =>
_parseResource(resource)
)
const rootResourcePath = _parseAttribute(
'rootResourcePath',
compile.rootResourcePath,
{
default: 'main.tex',
type: 'string',
}
)
response.rootResourcePath = _checkPath(rootResourcePath)
response.editorId = _parseAttribute('editorId', compile.options.editorId, {
type: 'string',
regex: EDITOR_ID_REGEX,
})
response.buildId = _parseAttribute('buildId', compile.options.buildId, {
type: 'string',
regex: OutputCacheManager.BUILD_REGEX,
})
} catch (error) {
return callback(error)
}
callback(null, response)
}
function _parseResource(resource) {
let modified
if (resource.path == null || typeof resource.path !== 'string') {
throw new Error('all resources should have a path attribute')
}
if (resource.modified != null) {
modified = new Date(resource.modified)
if (isNaN(modified.getTime())) {
throw new Error(
`resource modified date could not be understood: ${resource.modified}`
)
}
}
if (resource.url == null && resource.content == null) {
throw new Error(
'all resources should have either a url or content attribute'
)
}
if (resource.content != null && typeof resource.content !== 'string') {
throw new Error('content attribute should be a string')
}
if (resource.url != null && typeof resource.url !== 'string') {
throw new Error('url attribute should be a string')
}
if (resource.fallbackURL && typeof resource.fallbackURL !== 'string') {
throw new Error('fallbackURL attribute should be a string')
}
return {
path: resource.path,
modified,
url: resource.url,
fallbackURL: resource.fallbackURL,
content: resource.content,
}
}
function _parseAttribute(name, attribute, options) {
if (attribute != null) {
if (options.validValues != null) {
if (options.validValues.indexOf(attribute) === -1) {
throw new Error(
`${name} attribute should be one of: ${options.validValues.join(
', '
)}`
)
}
}
if (options.type != null) {
// eslint-disable-next-line valid-typeof
if (typeof attribute !== options.type) {
throw new Error(`${name} attribute should be a ${options.type}`)
}
}
if (options.type === 'string' && options.regex instanceof RegExp) {
if (!options.regex.test(attribute)) {
throw new Error(
`${name} attribute does not match regex ${options.regex}`
)
}
}
} else {
if (options.default != null) {
return options.default
}
}
return attribute
}
function _checkPath(path) {
// check that the request does not use a relative path
for (const dir of Array.from(path.split('/'))) {
if (dir === '..') {
throw new Error('relative path in root resource')
}
}
return path
}
module.exports = { parse, MAX_TIMEOUT }
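// --- Usage sketch (illustrative only; the resource paths and URL are made up) ---
// A minimal compile request body and its parsed result: unset options fall
// back to the defaults above (pdflatex, main.tex, 600s timeout clamp), and the
// timeout is converted to milliseconds.
// eslint-disable-next-line no-unused-vars
function exampleParse(callback) {
  const body = {
    compile: {
      options: { compiler: 'xelatex', timeout: 120 },
      rootResourcePath: 'thesis.tex',
      resources: [
        { path: 'thesis.tex', content: '\\documentclass{article}...' },
        { path: 'logo.png', url: 'http://filestore.example/logo.png' },
      ],
    },
  }
  parse(body, (err, request) => {
    if (err) return callback(err)
    // request.compiler === 'xelatex', request.timeout === 120000
    callback(null, { compiler: request.compiler, timeoutMs: request.timeout })
  })
}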

View File

@@ -0,0 +1,116 @@
const Path = require('node:path')
const fs = require('node:fs')
const logger = require('@overleaf/logger')
const Errors = require('./Errors')
const SafeReader = require('./SafeReader')
module.exports = {
// The sync state is an identifier which must match for an
// incremental update to be allowed.
//
// The initial value is passed in and stored on a full
// compile, along with the list of resources.
//
// Subsequent incremental compiles must come with the same value - if
// not they will be rejected with a 409 Conflict response. The
// previous list of resources is returned.
//
// An incremental compile can only update existing files with new
// content. The sync state identifier must change if any docs or
// files are moved, added, deleted or renamed.
SYNC_STATE_FILE: '.project-sync-state',
SYNC_STATE_MAX_SIZE: 128 * 1024,
saveProjectState(state, resources, basePath, callback) {
const stateFile = Path.join(basePath, this.SYNC_STATE_FILE)
if (state == null) {
// remove the file if no state passed in
logger.debug({ state, basePath }, 'clearing sync state')
fs.unlink(stateFile, function (err) {
if (err && err.code !== 'ENOENT') {
return callback(err)
} else {
return callback()
}
})
} else {
logger.debug({ state, basePath }, 'writing sync state')
const resourceList = resources.map(resource => resource.path)
fs.writeFile(
stateFile,
[...resourceList, `stateHash:${state}`].join('\n'),
callback
)
}
},
checkProjectStateMatches(state, basePath, callback) {
const stateFile = Path.join(basePath, this.SYNC_STATE_FILE)
const size = this.SYNC_STATE_MAX_SIZE
SafeReader.readFile(
stateFile,
size,
'utf8',
function (err, result, bytesRead) {
if (err) {
return callback(err)
}
if (bytesRead === size) {
logger.error(
{ file: stateFile, size, bytesRead },
'project state file truncated'
)
}
const array = result ? result.toString().split('\n') : []
const adjustedLength = Math.max(array.length, 1)
const resourceList = array.slice(0, adjustedLength - 1)
const oldState = array[adjustedLength - 1]
const newState = `stateHash:${state}`
logger.debug(
{ state, oldState, basePath, stateMatches: newState === oldState },
'checking sync state'
)
if (newState !== oldState) {
return callback(
new Errors.FilesOutOfSyncError(
'invalid state for incremental update'
)
)
} else {
const resources = resourceList.map(path => ({ path }))
callback(null, resources)
}
}
)
},
checkResourceFiles(resources, allFiles, basePath, callback) {
// check the paths are all relative to current directory
const containsRelativePath = resource => {
const dirs = resource.path.split('/')
return dirs.indexOf('..') !== -1
}
if (resources.some(containsRelativePath)) {
return callback(new Error('relative path in resource file list'))
}
// check if any of the input files are not present in list of files
const seenFiles = new Set(allFiles)
const missingFiles = resources
.map(resource => resource.path)
.filter(path => !seenFiles.has(path))
if (missingFiles.length > 0) {
logger.err(
{ missingFiles, basePath, allFiles, resources },
'missing input files for project'
)
return callback(
new Errors.FilesOutOfSyncError(
'resource files missing in incremental update'
)
)
} else {
callback()
}
},
}
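// --- Usage sketch (illustrative only; the paths and hash are assumptions) ---
// A full compile stores the state file; the next incremental compile must
// present the same hash or checkProjectStateMatches yields FilesOutOfSyncError.
// eslint-disable-next-line no-unused-vars
function exampleSyncStateRoundTrip(basePath, callback) {
  const resources = [{ path: 'main.tex' }, { path: 'refs.bib' }]
  module.exports.saveProjectState('hash-abc123', resources, basePath, err => {
    if (err) return callback(err)
    // calls back with [{ path: 'main.tex' }, { path: 'refs.bib' }] on a match
    module.exports.checkProjectStateMatches('hash-abc123', basePath, callback)
  })
}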

View File

@@ -0,0 +1,384 @@
/* eslint-disable
no-return-assign,
no-unused-vars,
no-useless-escape,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let ResourceWriter
const { promisify } = require('node:util')
const UrlCache = require('./UrlCache')
const Path = require('node:path')
const fs = require('node:fs')
const async = require('async')
const OutputFileFinder = require('./OutputFileFinder')
const ResourceStateManager = require('./ResourceStateManager')
const Metrics = require('./Metrics')
const logger = require('@overleaf/logger')
const settings = require('@overleaf/settings')
const parallelFileDownloads = settings.parallelFileDownloads || 1
module.exports = ResourceWriter = {
syncResourcesToDisk(request, basePath, callback) {
if (callback == null) {
callback = function () {}
}
if (request.syncType === 'incremental') {
logger.debug(
{ projectId: request.project_id, userId: request.user_id },
'incremental sync'
)
return ResourceStateManager.checkProjectStateMatches(
request.syncState,
basePath,
function (error, resourceList) {
if (error != null) {
return callback(error)
}
return ResourceWriter._removeExtraneousFiles(
request,
resourceList,
basePath,
function (error, outputFiles, allFiles) {
if (error != null) {
return callback(error)
}
return ResourceStateManager.checkResourceFiles(
resourceList,
allFiles,
basePath,
function (error) {
if (error != null) {
return callback(error)
}
return ResourceWriter.saveIncrementalResourcesToDisk(
request.project_id,
request.resources,
basePath,
function (error) {
if (error != null) {
return callback(error)
}
return callback(null, resourceList)
}
)
}
)
}
)
}
)
}
logger.debug(
{ projectId: request.project_id, userId: request.user_id },
'full sync'
)
UrlCache.createProjectDir(request.project_id, error => {
if (error != null) {
return callback(error)
}
ResourceWriter.saveAllResourcesToDisk(
request,
basePath,
function (error) {
if (error != null) {
return callback(error)
}
return ResourceStateManager.saveProjectState(
request.syncState,
request.resources,
basePath,
function (error) {
if (error != null) {
return callback(error)
}
return callback(null, request.resources)
}
)
}
)
})
},
saveIncrementalResourcesToDisk(projectId, resources, basePath, callback) {
if (callback == null) {
callback = function () {}
}
return ResourceWriter._createDirectory(basePath, error => {
if (error != null) {
return callback(error)
}
const jobs = Array.from(resources).map(resource =>
(resource => {
return callback =>
ResourceWriter._writeResourceToDisk(
projectId,
resource,
basePath,
callback
)
})(resource)
)
return async.parallelLimit(jobs, parallelFileDownloads, callback)
})
},
saveAllResourcesToDisk(request, basePath, callback) {
if (callback == null) {
callback = function () {}
}
return ResourceWriter._createDirectory(basePath, error => {
if (error != null) {
return callback(error)
}
const { project_id: projectId, resources } = request
ResourceWriter._removeExtraneousFiles(
request,
resources,
basePath,
error => {
if (error != null) {
return callback(error)
}
const jobs = Array.from(resources).map(resource =>
(resource => {
return callback =>
ResourceWriter._writeResourceToDisk(
projectId,
resource,
basePath,
callback
)
})(resource)
)
return async.parallelLimit(jobs, parallelFileDownloads, callback)
}
)
})
},
_createDirectory(basePath, callback) {
if (callback == null) {
callback = function () {}
}
return fs.mkdir(basePath, function (err) {
if (err != null) {
if (err.code === 'EEXIST') {
return callback()
} else {
logger.debug({ err, dir: basePath }, 'error creating directory')
return callback(err)
}
} else {
return callback()
}
})
},
_removeExtraneousFiles(request, resources, basePath, _callback) {
if (_callback == null) {
_callback = function () {}
}
const timer = new Metrics.Timer(
'unlink-output-files',
1,
request.metricsOpts
)
const callback = function (error, ...result) {
timer.done()
return _callback(error, ...Array.from(result))
}
return OutputFileFinder.findOutputFiles(
resources,
basePath,
(error, outputFiles, allFiles) => {
if (error != null) {
return callback(error)
}
const jobs = []
for (const { path } of outputFiles || []) {
const shouldDelete = ResourceWriter.isExtraneousFile(path)
if (shouldDelete) {
jobs.push(callback =>
ResourceWriter._deleteFileIfNotDirectory(
Path.join(basePath, path),
callback
)
)
}
}
return async.series(jobs, function (error) {
if (error != null) {
return callback(error)
}
return callback(null, outputFiles, allFiles)
})
}
)
},
isExtraneousFile(path) {
let shouldDelete = true
if (
path.match(/^output\./) ||
path.match(/\.aux$/) ||
path.match(/^cache\//)
) {
// knitr cache
shouldDelete = false
}
if (path.match(/^output-.*/)) {
// Tikz cached figures (default case)
shouldDelete = false
}
if (path.match(/\.(pdf|dpth|md5)$/)) {
// Tikz cached figures (by extension)
shouldDelete = false
}
if (
path.match(/\.(pygtex|pygstyle)$/) ||
path.match(/(^|\/)_minted-[^\/]+\//)
) {
// minted files/directory
shouldDelete = false
}
if (path.match(/\.md\.tex$/) || path.match(/(^|\/)_markdown_[^\/]+\//)) {
// markdown files/directory
shouldDelete = false
}
if (path.match(/-eps-converted-to\.pdf$/)) {
// Epstopdf generated files
shouldDelete = false
}
if (
path === 'output.tar.gz' ||
path === 'output.synctex.gz' ||
path === 'output.pdfxref' ||
path === 'output.pdf' ||
path === 'output.dvi' ||
path === 'output.log' ||
path === 'output.xdv' ||
path === 'output.stdout' ||
path === 'output.stderr'
) {
shouldDelete = true
}
if (path === 'output.tex') {
// created by TikzManager if present in output files
shouldDelete = true
}
return shouldDelete
},
_deleteFileIfNotDirectory(path, callback) {
if (callback == null) {
callback = function () {}
}
return fs.stat(path, function (error, stat) {
if (error != null && error.code === 'ENOENT') {
return callback()
} else if (error != null) {
logger.err(
{ err: error, path },
'error stating file in deleteFileIfNotDirectory'
)
return callback(error)
} else if (stat.isFile()) {
return fs.unlink(path, function (error) {
if (error != null) {
logger.err(
{ err: error, path },
'error removing file in deleteFileIfNotDirectory'
)
return callback(error)
} else {
return callback()
}
})
} else {
return callback()
}
})
},
_writeResourceToDisk(projectId, resource, basePath, callback) {
if (callback == null) {
callback = function () {}
}
return ResourceWriter.checkPath(
basePath,
resource.path,
function (error, path) {
if (error != null) {
return callback(error)
}
return fs.mkdir(
Path.dirname(path),
{ recursive: true },
function (error) {
if (error != null) {
return callback(error)
}
// TODO: Don't overwrite file if it hasn't been modified
if (resource.url != null) {
return UrlCache.downloadUrlToFile(
projectId,
resource.url,
resource.fallbackURL,
path,
resource.modified,
function (err) {
if (err != null) {
logger.err(
{
err,
projectId,
path,
resourceUrl: resource.url,
modified: resource.modified,
},
'error downloading file for resources'
)
Metrics.inc('download-failed')
}
return callback()
}
) // try and continue compiling even if http resource can not be downloaded at this time
} else {
fs.writeFile(path, resource.content, callback)
}
}
)
}
)
},
checkPath(basePath, resourcePath, callback) {
const path = Path.normalize(Path.join(basePath, resourcePath))
if (path.slice(0, basePath.length + 1) !== basePath + '/') {
return callback(new Error('resource path is outside root directory'))
} else {
return callback(null, path)
}
},
}
module.exports.promises = {
syncResourcesToDisk: promisify(ResourceWriter.syncResourcesToDisk),
saveIncrementalResourcesToDisk: promisify(
ResourceWriter.saveIncrementalResourcesToDisk
),
saveAllResourcesToDisk: promisify(ResourceWriter.saveAllResourcesToDisk),
checkPath: promisify(ResourceWriter.checkPath),
}
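// --- Usage sketch (illustrative only; the directories are assumptions) ---
// checkPath() is the guard that keeps every written resource inside the
// compile directory; a '..' that escapes the root is rejected.
// eslint-disable-next-line no-unused-vars
function exampleCheckPaths() {
  ResourceWriter.checkPath('/compiles/example', 'chapters/intro.tex', (err, p) => {
    // err === null, p === '/compiles/example/chapters/intro.tex'
  })
  ResourceWriter.checkPath('/compiles/example', '../other/secret.tex', err => {
    // err.message === 'resource path is outside root directory'
  })
}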

View File

@@ -0,0 +1,62 @@
/* eslint-disable
no-unused-vars,
n/no-deprecated-api,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let SafeReader
const fs = require('node:fs')
const logger = require('@overleaf/logger')
module.exports = SafeReader = {
// safely read up to size bytes from a file and return result as a
// string
readFile(file, size, encoding, callback) {
if (callback == null) {
callback = function () {}
}
return fs.open(file, 'r', function (err, fd) {
if (err != null && err.code === 'ENOENT') {
return callback()
}
if (err != null) {
return callback(err)
}
// safely return always closing the file
const callbackWithClose = (err, ...result) =>
fs.close(fd, function (err1) {
if (err != null) {
return callback(err)
}
if (err1 != null) {
return callback(err1)
}
return callback(null, ...Array.from(result))
})
const buff = Buffer.alloc(size) // fills with zeroes by default
return fs.read(
fd,
buff,
0,
buff.length,
0,
function (err, bytesRead, buffer) {
if (err != null) {
return callbackWithClose(err)
}
const result = buffer.toString(encoding, 0, bytesRead)
return callbackWithClose(null, result, bytesRead)
}
)
})
},
}
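// --- Usage sketch (illustrative only; the path is an assumption) ---
// Reads at most 1 KiB of a potentially large log file; a missing file calls
// back with no arguments rather than an error.
// eslint-disable-next-line no-unused-vars
function examplePeekAtLog(callback) {
  SafeReader.readFile('/compiles/example/output.log', 1024, 'utf8', (err, text, bytesRead) => {
    if (err) return callback(err)
    if (text == null) return callback(null, 'no log file yet')
    callback(null, `read ${bytesRead} bytes: ${text.slice(0, 80)}`)
  })
}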

View File

@@ -0,0 +1,89 @@
/* eslint-disable
no-cond-assign,
no-unused-vars,
n/no-deprecated-api,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let ForbidSymlinks
const Path = require('node:path')
const fs = require('node:fs')
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
module.exports = ForbidSymlinks = function (staticFn, root, options) {
const expressStatic = staticFn(root, options)
const basePath = Path.resolve(root)
return function (req, res, next) {
let file, projectId, result
const path = req.url
// check that the path is of the form /project_id_or_name/path/to/file.log
if ((result = path.match(/^\/([a-zA-Z0-9_-]+)\/(.*)$/s))) {
projectId = result[1]
file = result[2]
if (path !== `/${projectId}/${file}`) {
logger.warn({ path }, 'unrecognized file request')
return res.sendStatus(404)
}
} else {
logger.warn({ path }, 'unrecognized file request')
return res.sendStatus(404)
}
// check that the file does not use a relative path
for (const dir of Array.from(file.split('/'))) {
if (dir === '..') {
logger.warn({ path }, 'attempt to use a relative path')
return res.sendStatus(404)
}
}
// check that the requested path is normalized
const requestedFsPath = `${basePath}/${projectId}/${file}`
if (requestedFsPath !== Path.normalize(requestedFsPath)) {
logger.error(
{ path: requestedFsPath },
'requestedFsPath is not normalized'
)
return res.sendStatus(404)
}
// check that the requested path is not a symlink
return fs.realpath(requestedFsPath, function (err, realFsPath) {
if (err != null) {
if (err.code === 'ENOENT') {
return res.sendStatus(404)
} else {
logger.error(
{
err,
requestedFsPath,
realFsPath,
path: req.params[0],
projectId: req.params.project_id,
},
'error checking file access'
)
return res.sendStatus(500)
}
} else if (requestedFsPath !== realFsPath) {
logger.warn(
{
requestedFsPath,
realFsPath,
path: req.params[0],
projectId: req.params.project_id,
},
'trying to access a different file (symlink), aborting'
)
return res.sendStatus(404)
} else {
return expressStatic(req, res, next)
}
})
}
}
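// --- Usage sketch (illustrative only; mounting on the output dir is an assumption) ---
// Wraps express.static so that path traversal and symlinked files under the
// served directory get a 404 instead of being returned.
// eslint-disable-next-line no-unused-vars
function exampleStaticMiddleware() {
  const express = require('express')
  return ForbidSymlinks(express.static, Settings.path.outputDir, {
    maxAge: 0, // options are passed straight through to express.static
  })
}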

View File

@@ -0,0 +1,113 @@
const Path = require('node:path')
/**
* Parse output from the `synctex view` command
*/
function parseViewOutput(output) {
return _parseOutput(output, (record, label, value) => {
switch (label) {
case 'Page':
_setIntProp(record, 'page', value)
break
case 'h':
_setFloatProp(record, 'h', value)
break
case 'v':
_setFloatProp(record, 'v', value)
break
case 'W':
_setFloatProp(record, 'width', value)
break
case 'H':
_setFloatProp(record, 'height', value)
break
}
})
}
/**
* Parse output from the `synctex edit` command
*/
function parseEditOutput(output, baseDir) {
return _parseOutput(output, (record, label, value) => {
switch (label) {
case 'Input':
if (Path.isAbsolute(value)) {
record.file = Path.relative(baseDir, value)
} else {
record.file = value
}
break
case 'Line':
_setIntProp(record, 'line', value)
break
case 'Column':
_setIntProp(record, 'column', value)
break
}
})
}
/**
* Generic parser for synctex output
*
* Parses the output into records. Each line is split into a label and a value,
* which are then sent to `processLine` for further processing.
*/
function _parseOutput(output, processLine) {
const lines = output.split('\n')
let currentRecord = null
const records = []
for (const line of lines) {
const [label, value] = _splitLine(line)
// A line that starts with 'Output:' indicates a new record
if (label === 'Output') {
// Start new record
currentRecord = {}
records.push(currentRecord)
continue
}
// Ignore the line if we're not in a record yet
if (currentRecord == null) {
continue
}
// Process the line
processLine(currentRecord, label, value)
}
return records
}
/**
* Split a line in label and value components.
*
* The components are separated by a colon. Note that this is slightly
* different from `line.split(':', 2)`. This version puts the entirety of the
* line after the colon in the value component, even if there are more colons
* on the line.
*/
function _splitLine(line) {
const splitIndex = line.indexOf(':')
if (splitIndex === -1) {
return ['', line]
}
return [line.slice(0, splitIndex).trim(), line.slice(splitIndex + 1).trim()]
}
function _setIntProp(record, prop, value) {
const intValue = parseInt(value, 10)
if (!isNaN(intValue)) {
record[prop] = intValue
}
}
function _setFloatProp(record, prop, value) {
const floatValue = parseFloat(value)
if (!isNaN(floatValue)) {
record[prop] = floatValue
}
}
module.exports = { parseViewOutput, parseEditOutput }
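// --- Usage sketch (illustrative only; the numbers are made up) ---
// Parses the kind of text that `synctex view` prints on stdout.
// eslint-disable-next-line no-unused-vars
function exampleParseView() {
  const output = [
    'Output:/compile/output.pdf',
    'Page:3',
    'h:133.77',
    'v:520.11',
    'W:343.71',
    'H:9.96',
  ].join('\n')
  return parseViewOutput(output)
  // -> [{ page: 3, h: 133.77, v: 520.11, width: 343.71, height: 9.96 }]
}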

View File

@@ -0,0 +1,109 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let TikzManager
const fs = require('node:fs')
const Path = require('node:path')
const { promisify } = require('node:util')
const ResourceWriter = require('./ResourceWriter')
const SafeReader = require('./SafeReader')
const logger = require('@overleaf/logger')
// for \tikzexternalize or pstool to work the main file needs to match the
// jobname. Since we set the -jobname to output, we have to create a
// copy of the main file as 'output.tex'.
module.exports = TikzManager = {
checkMainFile(compileDir, mainFile, resources, callback) {
// if there's already an output.tex file, we don't want to touch it
if (callback == null) {
callback = function () {}
}
for (const resource of Array.from(resources)) {
if (resource.path === 'output.tex') {
logger.debug(
{ compileDir, mainFile },
'output.tex already in resources'
)
return callback(null, false)
}
}
// if there's no output.tex, see if we are using tikz/pgf or pstool in the main file
return ResourceWriter.checkPath(
compileDir,
mainFile,
function (error, path) {
if (error != null) {
return callback(error)
}
return SafeReader.readFile(
path,
65536,
'utf8',
function (error, content) {
if (error != null) {
return callback(error)
}
const usesTikzExternalize =
(content != null
? content.indexOf('\\tikzexternalize')
: undefined) >= 0
const usesPsTool =
(content != null ? content.indexOf('{pstool}') : undefined) >= 0
logger.debug(
{ compileDir, mainFile, usesTikzExternalize, usesPsTool },
'checked for packages needing main file as output.tex'
)
const needsMainFile = usesTikzExternalize || usesPsTool
return callback(null, needsMainFile)
}
)
}
)
},
injectOutputFile(compileDir, mainFile, callback) {
if (callback == null) {
callback = function () {}
}
return ResourceWriter.checkPath(
compileDir,
mainFile,
function (error, path) {
if (error != null) {
return callback(error)
}
return fs.readFile(path, 'utf8', function (error, content) {
if (error != null) {
return callback(error)
}
logger.debug(
{ compileDir, mainFile },
'copied file to output.tex as project uses packages which require it'
)
// use wx flag to ensure that output file does not already exist
return fs.writeFile(
Path.join(compileDir, 'output.tex'),
content,
{ flag: 'wx' },
callback
)
})
}
)
},
}
module.exports.promises = {
checkMainFile: promisify(TikzManager.checkMainFile),
injectOutputFile: promisify(TikzManager.injectOutputFile),
}
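
A short sketch of how the promisified API above might be driven by a caller; the compileDir, mainFile and resources values are hypothetical:

const TikzManager = require('./TikzManager') // path assumed

async function maybeInjectOutputTex(compileDir, mainFile, resources) {
  // resources is an array of { path } objects, as checkMainFile expects
  const needsOutputFile = await TikzManager.promises.checkMainFile(
    compileDir,
    mainFile,
    resources
  )
  if (needsOutputFile) {
    // copies mainFile to output.tex so that -jobname=output matches
    await TikzManager.promises.injectOutputFile(compileDir, mainFile)
  }
  return needsOutputFile
}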

View File

@@ -0,0 +1,132 @@
/* eslint-disable
no-return-assign,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const UrlFetcher = require('./UrlFetcher')
const Settings = require('@overleaf/settings')
const fs = require('node:fs')
const Path = require('node:path')
const { callbackify } = require('node:util')
const Metrics = require('./Metrics')
const PENDING_DOWNLOADS = new Map()
function getProjectDir(projectId) {
return Path.join(Settings.path.clsiCacheDir, projectId)
}
function getCachePath(projectId, url, lastModified) {
// The url is a filestore URL.
// It is sufficient to look at the path and mtime for uniqueness.
const mtime = (lastModified && lastModified.getTime()) || 0
const key = new URL(url).pathname.replace(/\//g, '-') + '-' + mtime
return Path.join(getProjectDir(projectId), key)
}
async function clearProject(projectId, options) {
const timer = new Metrics.Timer('url_cache', {
status: options?.reason || 'unknown',
path: 'delete',
})
await fs.promises.rm(getProjectDir(projectId), {
force: true,
recursive: true,
})
timer.done()
}
async function createProjectDir(projectId) {
await fs.promises.mkdir(getProjectDir(projectId), { recursive: true })
}
async function downloadUrlToFile(
projectId,
url,
fallbackURL,
destPath,
lastModified
) {
const cachePath = getCachePath(projectId, url, lastModified)
try {
const timer = new Metrics.Timer('url_cache', {
status: 'cache-hit',
path: 'copy',
})
try {
await fs.promises.copyFile(cachePath, destPath)
} catch (err) {
if (err.code === 'ENOENT' && fallbackURL) {
const fallbackPath = getCachePath(projectId, fallbackURL, lastModified)
await fs.promises.copyFile(fallbackPath, destPath)
} else {
throw err
}
}
// the metric is only updated if the file is present in the cache
timer.done()
return
} catch (e) {
if (e.code !== 'ENOENT') {
throw e
}
}
// time the download
{
const timer = new Metrics.Timer('url_cache', {
status: 'cache-miss',
path: 'download',
})
try {
await download(url, fallbackURL, cachePath)
} finally {
timer.done()
}
}
// time the file copy
{
const timer = new Metrics.Timer('url_cache', {
status: 'cache-miss',
path: 'copy',
})
await fs.promises.copyFile(cachePath, destPath)
timer.done()
}
}
async function download(url, fallbackURL, cachePath) {
let pending = PENDING_DOWNLOADS.get(cachePath)
if (pending) {
return pending
}
pending = UrlFetcher.promises.pipeUrlToFileWithRetry(
url,
fallbackURL,
cachePath
)
PENDING_DOWNLOADS.set(cachePath, pending)
try {
await pending
} finally {
PENDING_DOWNLOADS.delete(cachePath)
}
}
module.exports = {
clearProject: callbackify(clearProject),
createProjectDir: callbackify(createProjectDir),
downloadUrlToFile: callbackify(downloadUrlToFile),
promises: {
clearProject,
createProjectDir,
downloadUrlToFile,
},
}
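
A rough sketch of the intended call pattern for the cache above; the project id, URLs and destination path are made up purely for illustration:

const UrlCache = require('./UrlCache') // path assumed

async function fetchResourceExample() {
  const projectId = '000000000000000000000000'
  await UrlCache.promises.createProjectDir(projectId)
  await UrlCache.promises.downloadUrlToFile(
    projectId,
    'http://filestore/project/000000000000000000000000/file/abcdef', // primary URL (illustrative)
    null, // no fallback URL in this sketch
    '/compile/000000000000000000000000/frog.jpg', // destination path (illustrative)
    new Date(0) // lastModified feeds into the cache key
  )
  // later, e.g. when the project expires:
  await UrlCache.promises.clearProject(projectId, { reason: 'expired' })
}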

View File

@@ -0,0 +1,122 @@
const fs = require('node:fs')
const logger = require('@overleaf/logger')
const Settings = require('@overleaf/settings')
const {
CustomHttpAgent,
CustomHttpsAgent,
fetchStream,
RequestFailedError,
} = require('@overleaf/fetch-utils')
const { URL } = require('node:url')
const { pipeline } = require('node:stream/promises')
const Metrics = require('./Metrics')
const MAX_CONNECT_TIME = 1000
const httpAgent = new CustomHttpAgent({ connectTimeout: MAX_CONNECT_TIME })
const httpsAgent = new CustomHttpsAgent({ connectTimeout: MAX_CONNECT_TIME })
async function pipeUrlToFileWithRetry(url, fallbackURL, filePath) {
let remainingAttempts = 3
let lastErr
while (remainingAttempts-- > 0) {
const timer = new Metrics.Timer('url_fetcher', {
path: lastErr ? 'retry' : 'fetch',
})
try {
await pipeUrlToFile(url, fallbackURL, filePath)
timer.done({ status: 'success' })
return
} catch (err) {
timer.done({ status: 'error' })
logger.warn(
{ err, url, filePath, remainingAttempts },
'error downloading url'
)
lastErr = err
}
}
throw lastErr
}
async function pipeUrlToFile(url, fallbackURL, filePath) {
const u = new URL(url)
if (
Settings.filestoreDomainOveride &&
u.host !== Settings.apis.clsiPerf.host
) {
url = `${Settings.filestoreDomainOveride}${u.pathname}${u.search}`
}
if (fallbackURL) {
const u2 = new URL(fallbackURL)
if (
Settings.filestoreDomainOveride &&
u2.host !== Settings.apis.clsiPerf.host
) {
fallbackURL = `${Settings.filestoreDomainOveride}${u2.pathname}${u2.search}`
}
}
let stream
try {
stream = await fetchStream(url, {
signal: AbortSignal.timeout(60 * 1000),
// provide a function to get the agent for each request
// as there may be multiple requests with different protocols
// due to redirects.
agent: _url => (_url.protocol === 'https:' ? httpsAgent : httpAgent),
})
} catch (err) {
if (
fallbackURL &&
err instanceof RequestFailedError &&
err.response.status === 404
) {
stream = await fetchStream(fallbackURL, {
signal: AbortSignal.timeout(60 * 1000),
// provide a function to get the agent for each request
// as there may be multiple requests with different protocols
// due to redirects.
agent: _url => (_url.protocol === 'https:' ? httpsAgent : httpAgent),
})
url = fallbackURL
} else {
throw err
}
}
const source = inferSource(url)
Metrics.inc('url_source', 1, { path: source })
const atomicWrite = filePath + '~'
try {
const output = fs.createWriteStream(atomicWrite)
await pipeline(stream, output)
await fs.promises.rename(atomicWrite, filePath)
Metrics.count('UrlFetcher.downloaded_bytes', output.bytesWritten, {
path: source,
})
} catch (err) {
try {
await fs.promises.unlink(atomicWrite)
} catch (e) {}
throw err
}
}
const BUCKET_REGEX = /\/bucket\/([^/]+)\/key\//
function inferSource(url) {
if (url.includes(Settings.apis.clsiPerf.host)) {
return 'clsi-perf'
} else if (url.includes('/project/') && url.includes('/file/')) {
return 'user-files'
} else if (url.includes('/key/')) {
const match = url.match(BUCKET_REGEX)
if (match) return match[1]
}
return 'unknown'
}
module.exports.promises = {
pipeUrlToFileWithRetry,
}
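
A minimal sketch of calling the retrying download above directly; the URL and target path are illustrative only, and Settings.apis.clsiPerf.host is assumed to be configured:

const UrlFetcher = require('./UrlFetcher') // path assumed

async function downloadExample() {
  await UrlFetcher.promises.pipeUrlToFileWithRetry(
    'http://filestore/bucket/user-files/key/some-key', // primary URL (illustrative)
    null, // optional fallback URL, only tried on a 404
    '/tmp/downloaded-file' // written atomically via a "~" temp file
  )
}

downloadExample().catch(console.error)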

View File

@@ -0,0 +1,67 @@
const { NoXrefTableError } = require('./Errors')
const fs = require('node:fs')
const { O_RDONLY, O_NOFOLLOW } = fs.constants
const MAX_XREF_FILE_SIZE = 1024 * 1024
/** Parse qpdf --show-xref output to get a table of xref entries
*
* @param {string} filePath
* @param {number} pdfFileSize
 * @returns {Promise<{xRefEntries: Array<{offset: number, uncompressed?: boolean}>}>}
*/
async function parseXrefTable(filePath, pdfFileSize) {
try {
// the xref table will be written to output.pdfxref when available
const xRefFilePath = filePath + 'xref'
// check the size of the file (as it is untrusted)
const stats = await fs.promises.stat(xRefFilePath)
if (!stats.isFile()) {
throw new NoXrefTableError('xref file invalid type')
}
if (stats.size === 0) {
throw new NoXrefTableError('xref file empty')
}
if (stats.size > MAX_XREF_FILE_SIZE) {
throw new NoXrefTableError('xref file too large')
}
const content = await fs.promises.readFile(xRefFilePath, {
encoding: 'ascii',
flag: O_RDONLY | O_NOFOLLOW,
})
// the qpdf xref table output looks like this:
//
// 3/0: uncompressed; offset = 194159
//
// we only need the uncompressed objects
const matches = content.matchAll(
// put an upper limit of 10^10 on all the matched numbers for safety
// ignore the generation id in "id/gen"
// in a linearized pdf all objects must have generation number 0
/^\d{1,9}\/\d{1,9}: uncompressed; offset = (\d{1,9})$/gm
)
// include a zero-index object for backwards compatibility with
// our existing xref table parsing code
const xRefEntries = [{ offset: 0 }]
// extract all the xref table entries
for (const match of matches) {
const offset = parseInt(match[1], 10)
xRefEntries.push({ offset, uncompressed: true })
}
if (xRefEntries.length === 1) {
throw new NoXrefTableError('xref file has no objects')
}
return { xRefEntries }
} catch (err) {
if (err instanceof NoXrefTableError) {
throw err
} else if (err.code) {
throw new NoXrefTableError(`xref file error ${err.code}`)
} else {
throw new NoXrefTableError('xref file parse error')
}
}
}
module.exports = {
parseXrefTable,
}
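
A small self-contained sketch of the parser above; it writes a fake xref side-file next to a (nonexistent) PDF path, since parseXrefTable only reads `<pdf path>xref`. The temp-dir layout and require path are assumptions:

const fs = require('node:fs')
const os = require('node:os')
const Path = require('node:path')
const { parseXrefTable } = require('./parseXrefTable') // path assumed

async function xrefExample() {
  const dir = await fs.promises.mkdtemp(Path.join(os.tmpdir(), 'xref-'))
  const pdfPath = Path.join(dir, 'output.pdf')
  await fs.promises.writeFile(
    pdfPath + 'xref',
    [
      '1/0: uncompressed; offset = 15',
      '2/0: compressed; stream = 6, index = 0',
    ].join('\n')
  )
  console.log(await parseXrefTable(pdfPath, 0))
  // => { xRefEntries: [ { offset: 0 }, { offset: 15, uncompressed: true } ] }
}

xrefExample().catch(console.error)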

0
services/clsi/bin/.gitignore vendored Normal file
View File

View File

@@ -0,0 +1,4 @@
#!/bin/bash
set -e;
MOCHA="node_modules/.bin/mocha --recursive --reporter spec --timeout 15000"
$MOCHA "$@"

View File

@@ -0,0 +1,11 @@
clsi
--data-dirs=cache,compiles,output
--dependencies=
--docker-repos=gcr.io/overleaf-ops,us-east1-docker.pkg.dev/overleaf-ops/ol-docker
--env-add=ENABLE_PDF_CACHING="true",PDF_CACHING_ENABLE_WORKER_POOL="true",ALLOWED_IMAGES=quay.io/sharelatex/texlive-full:2017.1,TEXLIVE_IMAGE=quay.io/sharelatex/texlive-full:2017.1,TEX_LIVE_IMAGE_NAME_OVERRIDE=gcr.io/overleaf-ops,TEXLIVE_IMAGE_USER="tex",DOCKER_RUNNER="true",COMPILES_HOST_DIR=$PWD/compiles,OUTPUT_HOST_DIR=$PWD/output
--env-pass-through=
--esmock-loader=False
--node-version=20.18.2
--public-repo=True
--script-version=4.7.0
--use-large-ci-runner=True

View File

@@ -0,0 +1,47 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
version: "2.3"
services:
test_unit:
image: ci/$PROJECT_NAME:$BRANCH_NAME-$BUILD_NUMBER
command: npm run test:unit:_run
environment:
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
test_acceptance:
build: .
image: ci/$PROJECT_NAME:$BRANCH_NAME-$BUILD_NUMBER
environment:
ELASTIC_SEARCH_DSN: es:9200
MONGO_HOST: mongo
POSTGRES_HOST: postgres
MOCHA_GREP: ${MOCHA_GREP}
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
ENABLE_PDF_CACHING: "true"
PDF_CACHING_ENABLE_WORKER_POOL: "true"
ALLOWED_IMAGES: quay.io/sharelatex/texlive-full:2017.1
TEXLIVE_IMAGE: quay.io/sharelatex/texlive-full:2017.1
TEX_LIVE_IMAGE_NAME_OVERRIDE: gcr.io/overleaf-ops
TEXLIVE_IMAGE_USER: "tex"
DOCKER_RUNNER: "true"
COMPILES_HOST_DIR: $PWD/compiles
OUTPUT_HOST_DIR: $PWD/output
volumes:
- ./compiles:/overleaf/services/clsi/compiles
- /var/run/docker.sock:/var/run/docker.sock
command: npm run test:acceptance
tar:
build: .
image: ci/$PROJECT_NAME:$BRANCH_NAME-$BUILD_NUMBER
volumes:
- ./:/tmp/build/
command: tar -czf /tmp/build/build.tar.gz --exclude=build.tar.gz --exclude-vcs .
user: root

View File

@@ -0,0 +1,54 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
version: "2.3"
services:
test_unit:
build:
context: ../..
dockerfile: services/clsi/Dockerfile
target: base
volumes:
- .:/overleaf/services/clsi
- ../../node_modules:/overleaf/node_modules
- ../../libraries:/overleaf/libraries
working_dir: /overleaf/services/clsi
environment:
MOCHA_GREP: ${MOCHA_GREP}
LOG_LEVEL: ${LOG_LEVEL:-}
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
command: npm run --silent test:unit
test_acceptance:
build:
context: ../..
dockerfile: services/clsi/Dockerfile
target: base
volumes:
- .:/overleaf/services/clsi
- ../../node_modules:/overleaf/node_modules
- ../../libraries:/overleaf/libraries
- /var/run/docker.sock:/var/run/docker.sock
working_dir: /overleaf/services/clsi
environment:
ELASTIC_SEARCH_DSN: es:9200
MONGO_HOST: mongo
POSTGRES_HOST: postgres
MOCHA_GREP: ${MOCHA_GREP}
LOG_LEVEL: ${LOG_LEVEL:-}
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
ENABLE_PDF_CACHING: "true"
PDF_CACHING_ENABLE_WORKER_POOL: "true"
ALLOWED_IMAGES: quay.io/sharelatex/texlive-full:2017.1
TEXLIVE_IMAGE: quay.io/sharelatex/texlive-full:2017.1
TEX_LIVE_IMAGE_NAME_OVERRIDE: gcr.io/overleaf-ops
TEXLIVE_IMAGE_USER: "tex"
DOCKER_RUNNER: "true"
COMPILES_HOST_DIR: $PWD/compiles
OUTPUT_HOST_DIR: $PWD/output
command: npm run --silent test:acceptance

14
services/clsi/entrypoint.sh Executable file
View File

@@ -0,0 +1,14 @@
#!/bin/sh
# add the node user to the docker group on the host
DOCKER_GROUP=$(stat -c '%g' /var/run/docker.sock)
groupadd --non-unique --gid "${DOCKER_GROUP}" dockeronhost
usermod -aG dockeronhost node
# compatibility: initial volume setup
mkdir -p /overleaf/services/clsi/cache && chown node:node /overleaf/services/clsi/cache
mkdir -p /overleaf/services/clsi/compiles && chown node:node /overleaf/services/clsi/compiles
mkdir -p /overleaf/services/clsi/db && chown node:node /overleaf/services/clsi/db
mkdir -p /overleaf/services/clsi/output && chown node:node /overleaf/services/clsi/output
exec runuser -u node -- "$@"

24
services/clsi/install_deps.sh Executable file
View File

@@ -0,0 +1,24 @@
#!/bin/bash
set -ex
apt-get update
apt-get install -y \
  poppler-utils \
  ghostscript

rm -rf /var/lib/apt/lists/*
# Allow ImageMagick to process PDF files. This is for tests only, but since we
# use the production images for tests, this will apply to production as well.
patch /etc/ImageMagick-6/policy.xml <<EOF
--- old.xml 2022-03-23 09:16:03.985433900 -0400
+++ new.xml 2022-03-23 09:16:18.625471992 -0400
@@ -91,6 +91,5 @@
<policy domain="coder" rights="none" pattern="PS2" />
<policy domain="coder" rights="none" pattern="PS3" />
<policy domain="coder" rights="none" pattern="EPS" />
- <policy domain="coder" rights="none" pattern="PDF" />
<policy domain="coder" rights="none" pattern="XPS" />
</policymap>
EOF

41
services/clsi/kube.yaml Normal file
View File

@@ -0,0 +1,41 @@
apiVersion: v1
kind: Service
metadata:
name: clsi
namespace: default
spec:
type: LoadBalancer
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
run: clsi
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: clsi
namespace: default
spec:
replicas: 2
template:
metadata:
labels:
run: clsi
spec:
containers:
- name: clsi
image: gcr.io/henry-terraform-admin/clsi
imagePullPolicy: Always
readinessProbe:
httpGet:
path: status
port: 80
periodSeconds: 5
initialDelaySeconds: 0
failureThreshold: 3
successThreshold: 1

117
services/clsi/nginx.conf Normal file
View File

@@ -0,0 +1,117 @@
# keep in sync with clsi-startup.sh files
# keep in sync with server-ce/nginx/clsi-nginx.conf
# Changes to the above:
# - added debug header
server {
# Extra header for dev-env.
add_header 'X-Served-By' 'clsi-nginx' always;
listen 8080;
server_name clsi-proxy;
server_tokens off;
access_log off;
# Ignore symlinks possibly created by users
disable_symlinks on;
# enable compression for tex auxiliary files, but not for pdf files
gzip on;
gzip_types text/plain;
gzip_proxied any;
types {
text/plain log blg aux stdout stderr;
application/pdf pdf;
}
# user content domain access check
# The project-id is zero prefixed. No actual user project uses these ids.
# mongo-id 000000000000000000000000 -> 1970-01-01T00:00:00.000Z
# mongo-id 000000010000000000000000 -> 1970-01-01T00:00:01.000Z
# mongo-id 100000000000000000000000 -> 1978-07-04T21:24:16.000Z
# This allows us to distinguish between check-traffic and regular output traffic.
location ~ ^/project/0([0-9a-f]+)/user/([0-9a-f]+)/build/([0-9a-f-]+)/output/output\.pdf$ {
if ($request_method = 'OPTIONS') {
# handle OPTIONS method for CORS requests
add_header 'Allow' 'GET,HEAD';
return 204;
}
alias /var/clsi/tiny.pdf;
}
location ~ ^/project/0([0-9a-f]+)/build/([0-9a-f-]+)/output/output\.pdf$ {
if ($request_method = 'OPTIONS') {
# handle OPTIONS method for CORS requests
add_header 'Allow' 'GET,HEAD';
return 204;
}
alias /var/clsi/tiny.pdf;
}
# handle output files for specific users
location ~ ^/project/([0-9a-f]+)/user/([0-9a-f]+)/build/([0-9a-f-]+)/output/output\.([a-z.]+)$ {
if ($request_method = 'OPTIONS') {
# handle OPTIONS method for CORS requests
add_header 'Allow' 'GET,HEAD';
return 204;
}
alias /output/$1-$2/generated-files/$3/output.$4;
}
# handle .blg files for specific users
location ~ ^/project/([0-9a-f]+)/user/([0-9a-f]+)/build/([0-9a-f-]+)/output/(.+)\.blg$ {
if ($request_method = 'OPTIONS') {
# handle OPTIONS method for CORS requests
add_header 'Allow' 'GET,HEAD';
return 204;
}
alias /output/$1-$2/generated-files/$3/$4.blg;
}
# handle output files for anonymous users
location ~ ^/project/([0-9a-f]+)/build/([0-9a-f-]+)/output/output\.([a-z.]+)$ {
if ($request_method = 'OPTIONS') {
# handle OPTIONS method for CORS requests
add_header 'Allow' 'GET,HEAD';
return 204;
}
alias /output/$1/generated-files/$2/output.$3;
}
# handle .blg files for anonymous users
location ~ ^/project/([0-9a-f]+)/build/([0-9a-f-]+)/output/(.+)\.blg$ {
if ($request_method = 'OPTIONS') {
# handle OPTIONS method for CORS requests
add_header 'Allow' 'GET,HEAD';
return 204;
}
alias /output/$1/generated-files/$2/$3.blg;
}
# PDF range for specific users
location ~ ^/project/([0-9a-f]+)/user/([0-9a-f]+)/content/([0-9a-f-]+/[0-9a-f]+)$ {
if ($request_method = 'OPTIONS') {
# handle OPTIONS method for CORS requests
add_header 'Allow' 'GET,HEAD';
return 204;
}
# Cache for one day
expires 1d;
alias /output/$1-$2/content/$3;
}
# PDF range for anonymous users
location ~ ^/project/([0-9a-f]+)/content/([0-9a-f-]+/[0-9a-f]+)$ {
if ($request_method = 'OPTIONS') {
# handle OPTIONS method for CORS requests
add_header 'Allow' 'GET,HEAD';
return 204;
}
# Cache for one day
expires 1d;
alias /output/$1/content/$2;
}
# status endpoint for haproxy httpchk option
location /status {
return 200;
}
# load shedding probe
location = /instance-state {
alias /var/clsi/instance-state;
}
}

View File

@@ -0,0 +1,52 @@
{
"name": "@overleaf/clsi",
"description": "A Node.js implementation of the CLSI LaTeX web-API",
"private": true,
"main": "app.js",
"scripts": {
"start": "node app.js",
"test:acceptance:_run": "mocha --recursive --reporter spec --timeout 15000 --exit $@ test/acceptance/js",
"test:acceptance": "npm run test:acceptance:_run -- --grep=$MOCHA_GREP",
"test:unit:_run": "mocha --recursive --reporter spec $@ test/unit/js",
"test:unit": "npm run test:unit:_run -- --grep=$MOCHA_GREP",
"nodemon": "node --watch app.js",
"lint": "eslint --max-warnings 0 --format unix .",
"format": "prettier --list-different $PWD/'**/*.*js'",
"format:fix": "prettier --write $PWD/'**/*.*js'",
"lint:fix": "eslint --fix .",
"types:check": "tsc --noEmit"
},
"dependencies": {
"@overleaf/fetch-utils": "*",
"@overleaf/logger": "*",
"@overleaf/metrics": "*",
"@overleaf/o-error": "*",
"@overleaf/promise-utils": "*",
"@overleaf/settings": "*",
"archiver": "5.3.2",
"async": "^3.2.5",
"body-parser": "^1.20.3",
"bunyan": "^1.8.15",
"dockerode": "^4.0.5",
"express": "^4.21.2",
"lodash": "^4.17.21",
"p-limit": "^3.1.0",
"request": "^2.88.2",
"send": "^0.19.0",
"tar-fs": "^3.0.4",
"workerpool": "^6.1.5"
},
"devDependencies": {
"@types/workerpool": "^6.1.0",
"chai": "^4.3.6",
"chai-as-promised": "^7.1.1",
"mocha": "^11.1.0",
"mock-fs": "^5.1.2",
"node-fetch": "^2.7.0",
"sandboxed-module": "^2.0.4",
"sinon": "~9.0.1",
"sinon-chai": "^3.7.0",
"timekeeper": "2.2.0",
"typescript": "^5.0.4"
}
}

View File

@@ -0,0 +1,3 @@
FROM quay.io/sharelatex/texlive-full:2017.1
# RUN usermod -u 1001 tex

View File

@@ -0,0 +1,12 @@
const fs = require('node:fs')
const { parseXrefTable } = require('../app/lib/pdfjs/parseXrefTable')
const pdfPath = process.argv[2]
async function main() {
const size = (await fs.promises.stat(pdfPath)).size
const { xRefEntries } = await parseXrefTable(pdfPath, size)
console.log('Xref entries', xRefEntries)
}
main().catch(console.error)

View File

@@ -0,0 +1,841 @@
{
"defaultAction": "SCMP_ACT_ERRNO",
"architectures": [
"SCMP_ARCH_X86_64",
"SCMP_ARCH_X86",
"SCMP_ARCH_X32"
],
"syscalls": [
{
"name": "getrandom",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "access",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "arch_prctl",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "brk",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "chdir",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "chmod",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "clock_getres",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "clock_gettime",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "clock_nanosleep",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "clone",
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 2080505856,
"valueTwo": 0,
"op": "SCMP_CMP_MASKED_EQ"
}
]
},
{
"name": "close",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "copy_file_range",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "creat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "dup",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "dup2",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "dup3",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "execve",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "execveat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "exit",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "exit_group",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "faccessat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fadvise64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fadvise64_64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fallocate",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fchdir",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fchmod",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fchmodat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fcntl",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fcntl64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fdatasync",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fork",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fstat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fstat64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fstatat64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fstatfs",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fstatfs64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fsync",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "ftruncate",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "ftruncate64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "futex",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "futimesat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getcpu",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getcwd",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getdents",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getdents64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getegid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getegid32",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "geteuid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "geteuid32",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getgid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getgid32",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getgroups",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getgroups32",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getpgid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getpgrp",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getpid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getppid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getpriority",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getresgid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getresgid32",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getresuid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getresuid32",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getrlimit",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "get_robust_list",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getrusage",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getsid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "gettid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getuid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "getuid32",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "ioctl",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "kill",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "_llseek",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "lseek",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "lstat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "lstat64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "madvise",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "mkdir",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "mkdirat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "mmap",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "mmap2",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "mprotect",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "mremap",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "munmap",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "newfstatat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "open",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "openat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "pause",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "pipe",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "pipe2",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "prctl",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "pread64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "preadv",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "prlimit64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "pwrite64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "pwritev",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "read",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "readlink",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "readlinkat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "readv",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "rename",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "renameat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "renameat2",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "restart_syscall",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "rmdir",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "rt_sigaction",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "rt_sigpending",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "rt_sigprocmask",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "rt_sigqueueinfo",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "rt_sigreturn",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "rt_sigsuspend",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "rt_sigtimedwait",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "rt_tgsigqueueinfo",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sched_getaffinity",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sched_getparam",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sched_get_priority_max",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sched_get_priority_min",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sched_getscheduler",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sched_rr_get_interval",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sched_yield",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sendfile",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sendfile64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "setgroups",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "setgroups32",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "set_robust_list",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "set_tid_address",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sigaltstack",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "stat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "stat64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "statfs",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "statfs64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sync",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sync_file_range",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "syncfs",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "sysinfo",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "tgkill",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "timer_create",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "timer_delete",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "timer_getoverrun",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "timer_gettime",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "timer_settime",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "times",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "tkill",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "truncate",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "truncate64",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "umask",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "uname",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "unlink",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "unlinkat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "utime",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "utimensat",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "utimes",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "vfork",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "vhangup",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "wait4",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "waitid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "write",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "writev",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "pread",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "setgid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "setuid",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "capget",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "capset",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "fchown",
"action": "SCMP_ACT_ALLOW",
"args": []
},
{
"name": "gettimeofday",
"action": "SCMP_ACT_ALLOW",
"args": []
}, {
"name": "epoll_pwait",
"action": "SCMP_ACT_ALLOW",
"args": []
}
]
}
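
For context, a hypothetical sketch of how a seccomp profile like the one above could be handed to the Docker runner via dockerode (already in this service's dependencies); the image, command and profile path are assumptions, not the service's actual wiring:

const fs = require('node:fs')
const Docker = require('dockerode')

async function runWithSeccompExample() {
  const profile = await fs.promises.readFile('seccomp/clsi-profile.json', 'utf8') // path assumed
  const docker = new Docker() // defaults to /var/run/docker.sock
  const container = await docker.createContainer({
    Image: 'quay.io/sharelatex/texlive-full:2017.1',
    Cmd: ['pdflatex', '--version'],
    HostConfig: {
      // Docker expects "seccomp=<json>" entries in SecurityOpt
      SecurityOpt: ['seccomp=' + profile],
    },
  })
  await container.start()
}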

View File

@@ -0,0 +1,34 @@
include /etc/firejail/disable-common.inc
include /etc/firejail/disable-devel.inc
# include /etc/firejail/disable-mgmt.inc ## removed in 0.9.40
# include /etc/firejail/disable-secret.inc ## removed in 0.9.40
read-only /bin
blacklist /boot
blacklist /dev
read-only /etc
blacklist /home # blacklisted for synctex
read-only /lib
read-only /lib64
blacklist /media
blacklist /mnt
blacklist /opt
blacklist /root
read-only /run
blacklist /sbin
blacklist /selinux
blacklist /src
blacklist /sys
read-only /usr
caps.drop all
noroot
nogroups
net none
private-tmp
private-dev
shell none
seccomp
nonewprivs

View File

@@ -0,0 +1,117 @@
\documentclass[12pt]{article}
% Use this form to include EPS (latex) or PDF (pdflatex) files:
\usepackage{asymptote}
% Use this form with latex or pdflatex to include inline LaTeX code by default:
%\usepackage[inline]{asymptote}
% Use this form with latex or pdflatex to create PDF attachments by default:
%\usepackage[attach]{asymptote}
% Enable this line to support the attach option:
%\usepackage[dvips]{attachfile2}
\begin{document}
% Optional subdirectory for asy files (no spaces):
\def\asydir{}
\begin{asydef}
// Global Asymptote definitions can be put here.
import three;
usepackage("bm");
texpreamble("\def\V#1{\bm{#1}}");
// One can globally override the default toolbar settings here:
// settings.toolbar=true;
\end{asydef}
Here is a venn diagram produced with Asymptote, drawn to width 4cm:
\def\A{A}
\def\B{\V{B}}
%\begin{figure}
\begin{center}
\begin{asy}
size(4cm,0);
pen colour1=red;
pen colour2=green;
pair z0=(0,0);
pair z1=(-1,0);
pair z2=(1,0);
real r=1.5;
path c1=circle(z1,r);
path c2=circle(z2,r);
fill(c1,colour1);
fill(c2,colour2);
picture intersection=new picture;
fill(intersection,c1,colour1+colour2);
clip(intersection,c2);
add(intersection);
draw(c1);
draw(c2);
//draw("$\A$",box,z1); // Requires [inline] package option.
//draw(Label("$\B$","$B$"),box,z2); // Requires [inline] package option.
draw("$A$",box,z1);
draw("$\V{B}$",box,z2);
pair z=(0,-2);
real m=3;
margin BigMargin=Margin(0,m*dot(unit(z1-z),unit(z0-z)));
draw(Label("$A\cap B$",0),conj(z)--z0,Arrow,BigMargin);
draw(Label("$A\cup B$",0),z--z0,Arrow,BigMargin);
draw(z--z1,Arrow,Margin(0,m));
draw(z--z2,Arrow,Margin(0,m));
shipout(bbox(0.25cm));
\end{asy}
%\caption{Venn diagram}\label{venn}
\end{center}
%\end{figure}
Each graph is drawn in its own environment. One can specify the width
and height to \LaTeX\ explicitly. This 3D example can be viewed
interactively either with Adobe Reader or Asymptote's fast OpenGL-based
renderer. To support {\tt latexmk}, 3D figures should specify
\verb+inline=true+. It is sometimes desirable to embed 3D files as annotated
attachments; this requires the \verb+attach=true+ option as well as the
\verb+attachfile2+ \LaTeX\ package.
\begin{center}
\begin{asy}[height=4cm,inline=true,attach=false,viewportwidth=\linewidth]
currentprojection=orthographic(5,4,2);
draw(unitcube,blue);
label("$V-E+F=2$",(0,1,0.5),3Y,blue+fontsize(17pt));
\end{asy}
\end{center}
One can also scale the figure to the full line width:
\begin{center}
\begin{asy}[width=\the\linewidth,inline=true]
pair z0=(0,0);
pair z1=(2,0);
pair z2=(5,0);
pair zf=z1+0.75*(z2-z1);
draw(z1--z2);
dot(z1,red+0.15cm);
dot(z2,darkgreen+0.3cm);
label("$m$",z1,1.2N,red);
label("$M$",z2,1.5N,darkgreen);
label("$\hat{\ }$",zf,0.2*S,fontsize(24pt)+blue);
pair s=-0.2*I;
draw("$x$",z0+s--z1+s,N,red,Arrows,Bars,PenMargins);
s=-0.5*I;
draw("$\bar{x}$",z0+s--zf+s,blue,Arrows,Bars,PenMargins);
s=-0.95*I;
draw("$X$",z0+s--z2+s,darkgreen,Arrows,Bars,PenMargins);
\end{asy}
\end{center}
\end{document}

View File

@@ -0,0 +1,81 @@
1/0: uncompressed; offset = 123103
2/0: uncompressed; offset = 123422
3/0: uncompressed; offset = 15
4/0: uncompressed; offset = 216
5/0: uncompressed; offset = 1084
6/0: uncompressed; offset = 1244
7/0: uncompressed; offset = 4001
8/0: uncompressed; offset = 4155
9/0: uncompressed; offset = 4297
10/0: uncompressed; offset = 4933
11/0: uncompressed; offset = 5309
12/0: uncompressed; offset = 5498
13/0: uncompressed; offset = 30250
14/0: uncompressed; offset = 31471
15/0: uncompressed; offset = 38404
16/0: uncompressed; offset = 39046
17/0: uncompressed; offset = 40166
18/0: uncompressed; offset = 40906
19/0: uncompressed; offset = 65560
20/0: uncompressed; offset = 74702
21/0: uncompressed; offset = 81705
22/0: uncompressed; offset = 97182
23/0: uncompressed; offset = 104117
24/0: uncompressed; offset = 111195
25/0: uncompressed; offset = 118571
26/0: compressed; stream = 6, index = 0
27/0: compressed; stream = 6, index = 1
28/0: compressed; stream = 6, index = 2
29/0: compressed; stream = 6, index = 3
30/0: compressed; stream = 6, index = 4
31/0: compressed; stream = 6, index = 5
32/0: compressed; stream = 6, index = 6
33/0: compressed; stream = 6, index = 7
34/0: compressed; stream = 6, index = 8
35/0: compressed; stream = 6, index = 9
36/0: compressed; stream = 6, index = 10
37/0: compressed; stream = 6, index = 11
38/0: compressed; stream = 6, index = 12
39/0: compressed; stream = 6, index = 13
40/0: compressed; stream = 6, index = 14
41/0: compressed; stream = 6, index = 15
42/0: compressed; stream = 6, index = 16
43/0: compressed; stream = 6, index = 17
44/0: compressed; stream = 6, index = 18
45/0: compressed; stream = 6, index = 19
46/0: compressed; stream = 6, index = 20
47/0: compressed; stream = 6, index = 21
48/0: compressed; stream = 6, index = 22
49/0: compressed; stream = 6, index = 23
50/0: compressed; stream = 6, index = 24
51/0: compressed; stream = 6, index = 25
52/0: compressed; stream = 6, index = 26
53/0: compressed; stream = 6, index = 27
54/0: compressed; stream = 6, index = 28
55/0: compressed; stream = 6, index = 29
56/0: compressed; stream = 6, index = 30
57/0: compressed; stream = 6, index = 31
58/0: compressed; stream = 6, index = 32
59/0: compressed; stream = 6, index = 33
60/0: compressed; stream = 6, index = 34
61/0: compressed; stream = 6, index = 35
62/0: compressed; stream = 6, index = 36
63/0: compressed; stream = 6, index = 37
64/0: compressed; stream = 6, index = 38
65/0: compressed; stream = 6, index = 39
66/0: compressed; stream = 6, index = 40
67/0: compressed; stream = 6, index = 41
68/0: compressed; stream = 6, index = 42
69/0: compressed; stream = 6, index = 43
70/0: compressed; stream = 6, index = 44
71/0: compressed; stream = 6, index = 45
72/0: compressed; stream = 6, index = 46
73/0: compressed; stream = 6, index = 47
74/0: compressed; stream = 6, index = 48
75/0: compressed; stream = 6, index = 49
76/0: compressed; stream = 6, index = 50
77/0: compressed; stream = 6, index = 51
78/0: compressed; stream = 6, index = 52
79/0: compressed; stream = 6, index = 53
80/0: compressed; stream = 6, index = 54
81/0: compressed; stream = 6, index = 55

View File

@@ -0,0 +1,9 @@
@book{DouglasAdams,
title={The Hitchhiker's Guide to the Galaxy},
author={Adams, Douglas},
isbn={9781417642595},
url={http://books.google.com/books?id=W-xMPgAACAAJ},
year={1995},
publisher={San Val}
}

View File

@@ -0,0 +1,12 @@
\documentclass{article}
\usepackage[backend=biber]{biblatex}
\addbibresource{bibliography.bib}
\begin{document}
The meaning of life, the universe and everything is 42 \cite{DouglasAdams}
\printbibliography
\end{document}

View File

@@ -0,0 +1,48 @@
% $ biblatex auxiliary file $
% $ biblatex version 1.5 $
% $ biber version 0.9.3 $
% Do not modify the above lines!
%
% This is an auxiliary file used by the 'biblatex' package.
% This file may safely be deleted. It will be recreated by
% biber or bibtex as required.
%
\begingroup
\makeatletter
\@ifundefined{ver@biblatex.sty}
{\@latex@error
{Missing 'biblatex' package}
{The bibliography requires the 'biblatex' package.}
\aftergroup\endinput}
{}
\endgroup
\refsection{0}
\entry{DouglasAdams}{book}{}
\name{labelname}{1}{}{%
{{}{Adams}{A\bibinitperiod}{Douglas}{D\bibinitperiod}{}{}{}{}}%
}
\name{author}{1}{}{%
{{}{Adams}{A\bibinitperiod}{Douglas}{D\bibinitperiod}{}{}{}{}}%
}
\list{publisher}{1}{%
{San Val}%
}
\strng{namehash}{AD1}
\strng{fullhash}{AD1}
\field{sortinit}{A}
\field{isbn}{9781417642595}
\field{title}{The Hitchhiker's Guide to the Galaxy}
\field{year}{1995}
\verb{url}
\verb http://books.google.com/books?id=W-xMPgAACAAJ
\endverb
\endentry
\lossort
\endlossort
\endrefsection
\endinput

View File

@@ -0,0 +1,31 @@
1/0: uncompressed; offset = 59313
2/0: uncompressed; offset = 59561
3/0: uncompressed; offset = 15
4/0: uncompressed; offset = 216
5/0: uncompressed; offset = 734
6/0: uncompressed; offset = 784
7/0: uncompressed; offset = 913
8/0: uncompressed; offset = 1028
9/0: uncompressed; offset = 1528
10/0: uncompressed; offset = 9787
11/0: uncompressed; offset = 18282
12/0: uncompressed; offset = 33607
13/0: uncompressed; offset = 45579
14/0: uncompressed; offset = 58005
15/0: compressed; stream = 14, index = 0
16/0: compressed; stream = 14, index = 1
17/0: compressed; stream = 14, index = 2
18/0: compressed; stream = 14, index = 3
19/0: compressed; stream = 14, index = 4
20/0: compressed; stream = 14, index = 5
21/0: compressed; stream = 14, index = 6
22/0: compressed; stream = 14, index = 7
23/0: compressed; stream = 14, index = 8
24/0: compressed; stream = 14, index = 9
25/0: compressed; stream = 14, index = 10
26/0: compressed; stream = 14, index = 11
27/0: compressed; stream = 14, index = 12
28/0: compressed; stream = 14, index = 13
29/0: compressed; stream = 14, index = 14
30/0: compressed; stream = 14, index = 15
31/0: compressed; stream = 14, index = 16

View File

@@ -0,0 +1,84 @@
<?xml version="1.0" standalone="yes"?>
<!-- logreq request file -->
<!-- logreq version 1.0 / dtd version 1.0 -->
<!-- Do not edit this file! -->
<!DOCTYPE requests [
<!ELEMENT requests (internal | external)*>
<!ELEMENT internal (generic, (provides | requires)*)>
<!ELEMENT external (generic, cmdline?, input?, output?, (provides | requires)*)>
<!ELEMENT cmdline (binary, (option | infile | outfile)*)>
<!ELEMENT input (file)+>
<!ELEMENT output (file)+>
<!ELEMENT provides (file)+>
<!ELEMENT requires (file)+>
<!ELEMENT generic (#PCDATA)>
<!ELEMENT binary (#PCDATA)>
<!ELEMENT option (#PCDATA)>
<!ELEMENT infile (#PCDATA)>
<!ELEMENT outfile (#PCDATA)>
<!ELEMENT file (#PCDATA)>
<!ATTLIST requests
version CDATA #REQUIRED
>
<!ATTLIST internal
package CDATA #REQUIRED
priority (9) #REQUIRED
active (0 | 1) #REQUIRED
>
<!ATTLIST external
package CDATA #REQUIRED
priority (1 | 2 | 3 | 4 | 5 | 6 | 7 | 8) #REQUIRED
active (0 | 1) #REQUIRED
>
<!ATTLIST provides
type (static | dynamic | editable) #REQUIRED
>
<!ATTLIST requires
type (static | dynamic | editable) #REQUIRED
>
<!ATTLIST file
type CDATA #IMPLIED
>
]>
<requests version="1.0">
<internal package="biblatex" priority="9" active="0">
<generic>latex</generic>
<provides type="dynamic">
<file>output.bcf</file>
</provides>
<requires type="dynamic">
<file>output.bbl</file>
</requires>
<requires type="static">
<file>blx-compat.def</file>
<file>biblatex.def</file>
<file>numeric.bbx</file>
<file>standard.bbx</file>
<file>numeric.cbx</file>
<file>biblatex.cfg</file>
<file>english.lbx</file>
</requires>
</internal>
<external package="biblatex" priority="5" active="0">
<generic>biber</generic>
<cmdline>
<binary>biber</binary>
<infile>output</infile>
</cmdline>
<input>
<file>output.bcf</file>
</input>
<output>
<file>output.bbl</file>
</output>
<provides type="dynamic">
<file>output.bbl</file>
</provides>
<requires type="dynamic">
<file>output.bcf</file>
</requires>
<requires type="editable">
<file>bibliography.bib</file>
</requires>
</external>
</requests>

Binary file not shown.


View File

@@ -0,0 +1,25 @@
\documentclass{article}
\usepackage{graphics}
\title{Your Paper}
\author{You}
\begin{document}
\maketitle
\begin{abstract}
Your abstract.
\end{abstract}
\section{Introduction}
This is the start of the document.
\begin{figure}[ht]
\includegraphics[0,0][100,100]{frog.jpg}
\end{figure}
This is the end of the document.
\end{document}

Some files were not shown because too many files have changed in this diff.