mirror of https://github.com/kiwix/kiwix-tools.git
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
467 lines
12 KiB
467 lines
12 KiB
***********
|
|
kiwix-serve
|
|
***********
|
|
|
|
Introduction
|
|
============
|
|
|
|
``kiwix-serve`` is a tool for serving ZIM file content over HTTP. It supports
|
|
serving a library containing multiple ZIM files. In a large library served by a
|
|
``kiwix-serve`` instance clients can look up/filter ZIM files of interest by
|
|
words in their :term:`titles <ZIM title>` and/or descriptions, language, tags, etc.
|
|
|
|
``kiwix-serve`` provides a ZIM file viewer for displaying inidividual pages
|
|
from a ZIM file inside the user's web browser (without downloading the full ZIM
|
|
file).
|
|
|
|
Clients can also remotely search inside those ZIM files that contain a full-text
|
|
search database.
|
|
|
|
|
|
Usage
|
|
=====
|
|
|
|
.. code-block:: sh
|
|
|
|
kiwix-serve --library [OPTIONS] LIBRARY_FILE_PATH
|
|
kiwix-serve [OPTIONS] ZIM_FILE_PATH ...
|
|
|
|
|
|
Arguments
|
|
---------
|
|
|
|
.. _cli-arg-library-file-path:
|
|
|
|
``LIBRARY_FILE_PATH``: path of an XML library file listing ZIM files to serve.
|
|
To be used only with the :option:`--library` option. Multiple library files can
|
|
be provided as a semicolon (``;``) separated list.
|
|
|
|
``ZIM_FILE_PATH``: ZIM file path (multiple arguments are allowed).
|
|
|
|
Options
|
|
-------
|
|
|
|
.. option:: --library
|
|
|
|
By default, ``kiwix-serve`` expects a list of ZIM files as command line
|
|
arguments. Providing the :option:`--library` option tells ``kiwix-serve``
|
|
that the command line argument is rather a :ref:`library XML file
|
|
<cli-arg-library-file-path>`.
|
|
|
|
.. option:: -i ADDR, --address=ADDR
|
|
|
|
Listen only on this IP address. By default the server listens on all
|
|
available IP addresses.
|
|
|
|
|
|
.. option:: -p PORT, --port=PORT
|
|
|
|
TCP port on which to listen for HTTP requests (default: 80).
|
|
|
|
|
|
.. option:: -r ROOT, --urlRootLocation=ROOT
|
|
|
|
URL prefix on which the content should be made available (default: empty).
|
|
|
|
|
|
.. option:: -d, --daemon
|
|
|
|
Detach the HTTP server daemon from the main process.
|
|
|
|
|
|
.. option:: -a PID, --attachToProcess=PID
|
|
|
|
Exit when the process with id PID stops running.
|
|
|
|
|
|
.. option:: -M, --monitorLibrary
|
|
|
|
Monitor the XML library file and reload it automatically when it changes.
|
|
|
|
Library reloading can be forced anytime by sending a SIGHUP signal to the
|
|
``kiwix-serve`` process (this works regardless of the presence of the
|
|
:option:`--monitorLibrary`/:option:`-M` option).
|
|
|
|
|
|
.. option:: -m, --nolibrarybutton
|
|
|
|
Disable the library home button in the ZIM viewer toolbar.
|
|
|
|
|
|
.. option:: -n, --nosearchbar
|
|
|
|
Disable the searchbox in the ZIM viewer toolbar.
|
|
|
|
|
|
.. option:: -b, --blockexternal
|
|
|
|
Prevent the users from directly navigating to external resources via such
|
|
links in ZIM content.
|
|
|
|
|
|
.. option:: -t N, --threads=N
|
|
|
|
Number of threads to run in parallel (default: 4).
|
|
|
|
|
|
.. option:: -s N, --searchLimit=N
|
|
|
|
Maximum number of ZIM files in a fulltext multizim search (default: No limit).
|
|
|
|
|
|
.. option:: -z, --nodatealiases
|
|
|
|
Create URL aliases for each content by removing the date embedded in the file
|
|
name. The expected format of the date in the filename is ``_YYYY-MM``. For
|
|
example, ZIM file ``wikipedia_en_all_2020-08.zim`` will be accessible both as
|
|
``wikipedia_en_all_2020-08`` and ``wikipedia_en_all``.
|
|
|
|
|
|
.. option:: -c PATH, --customIndex=PATH
|
|
|
|
Override the welcome page with a custom HTML file.
|
|
|
|
|
|
.. option:: -L N, --ipConnectionLimit=N
|
|
|
|
Max number of (concurrent) connections per IP (default: infinite,
|
|
recommended: >= 6).
|
|
|
|
|
|
.. option:: -v, --verbose
|
|
|
|
Print debug log to STDOUT.
|
|
|
|
|
|
.. option:: -V, --version
|
|
|
|
Print the software version.
|
|
|
|
|
|
.. option:: -h, --help
|
|
|
|
Print the help text.
|
|
|
|
|
|
HTTP API
|
|
========
|
|
|
|
``kiwix-serve`` serves content at/under ``http://ADDR:PORT/ROOT`` where
|
|
``ADDR``, ``PORT`` and ``ROOT`` are the values supplied to the
|
|
:option:`--address`/:option:`-i`, :option:`--port`/:option:`-p` and
|
|
:option:`--urlRootLocation`/:option:`-r` options, respectively.
|
|
|
|
HTTP API endpoints presented below are relative to that location, i.e.
|
|
``/foo/bar`` must be actually accessed as ``http://ADDR:PORT/ROOT/foo/bar``.
|
|
|
|
.. _welcome-page:
|
|
|
|
``/``
|
|
-----
|
|
|
|
Welcome page is served under ``/``. By default this is the library page, where
|
|
books are listed and can be looked up/filtered interactively. However, the
|
|
welcome page can be overriden through the :option:`--customIndex`/:option:`-c`
|
|
command line option of ``kiwix-serve``.
|
|
|
|
|
|
.. _new-opds-api:
|
|
|
|
``/catalog/v2`` (OPDS API)
|
|
------------------------------
|
|
|
|
The new OPDS API of ``kiwix-serve`` is based on the `OPDS Catalog specification
|
|
v1.2 <https://specs.opds.io/opds-1.2>`_. All of its endpoints are grouped under
|
|
``/catalog/v2``.
|
|
|
|
:ref:`Legacy OPDS API <legacy-opds-api>` is preserved for backward
|
|
compatibility.
|
|
|
|
|
|
``/catalog/v2/root.xml``
|
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
The OPDS Catalog Root links to the OPDS acquisition and navigation feeds
|
|
accessible through the other endpoints of the OPDS API.
|
|
|
|
|
|
``/catalog/v2/searchdescription.xml``
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Describes the `/catalog/v2/entries`_ endpoint in `OpenSearch description format
|
|
<https://developer.mozilla.org/en-US/docs/Web/OpenSearch>`_.
|
|
|
|
|
|
|
|
``/catalog/v2/categories``
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Returns the full list of ZIM file categories as an `OPDS Navigation Feed
|
|
<https://specs.opds.io/opds-1.2#22-navigation-feeds>`_.
|
|
|
|
|
|
``/catalog/v2/entries``
|
|
^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Returns the full or filtered list of ZIM files as an `OPDS acquisition feed
|
|
<https://specs.opds.io/opds-1.2#23-acquisition-feeds>`_ with `complete entries
|
|
<https://specs.opds.io/opds-1.2#512-partial-and-complete-catalog-entries>`_.
|
|
|
|
By default, all entries in the library are returned. A subset can be requested
|
|
by providing one or more filtering criteria, whereupon only entries matching
|
|
*all* of the criteria are included in the response. The filtering criteria must
|
|
be specified as URL search parameters.
|
|
|
|
* ``lang`` - filter by language (specified as a 3-letter language code).
|
|
|
|
* ``category`` - filter by categories associated with the library entries.
|
|
|
|
* ``tag`` - filter by tags associated with the library entries. Multiple tags
|
|
can be provided as a semicolon separated list (e.g
|
|
``tag=wikipedia;_videos:no``). The result will contain only those entries
|
|
that contain *all* of the requested tags.
|
|
|
|
* ``notag`` - filter out (exclude) entries with *any* of the specified tags
|
|
(example - ``notag=ted;youtube``).
|
|
|
|
* ``maxsize`` - include in the results only entries whose size (in bytes)
|
|
doesn't exceed the provided value.
|
|
|
|
* ``q`` - include in the results only entries that contain the specified text
|
|
in the title or description.
|
|
|
|
* ``name`` - include in the results only the entry with the specified
|
|
:term:`name <ZIM name>`.
|
|
|
|
* ``start=s`` and ``count=n`` - these parameters enable pagination of the
|
|
search/filtering results - the feed will contain (at most) ``n`` results
|
|
starting from the result # ``s`` (0-based).
|
|
|
|
|
|
**Examples:**
|
|
|
|
.. code:: sh
|
|
|
|
# List only books in Italian (lang=ita) but
|
|
# return only results ## 100-149 (start=100&count=50)
|
|
$ curl 'http://localhost:8080/catalog/v2/entries?lang=ita&start=100&count=50'
|
|
|
|
# List only books with category of 'wikipedia' AND containing the word
|
|
# 'science' in the title or description
|
|
$ curl 'http://localhost:8080/catalog/v2/entries?q=science&category=wikipedia'
|
|
|
|
``/catalog/v2/entry/ZIMID``
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Returns full info about the library entry with :term:`UUID <ZIM UUID>`
|
|
``ZIMID``.
|
|
|
|
|
|
``/catalog/v2/illustration/ZIMID``
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
**Usage:**
|
|
|
|
``/catalog/v2/illustration/ZIMID?size=N``
|
|
|
|
Returns the illustration of size ``NxN`` pixels for the library entry with
|
|
:term:`UUID <ZIM UUID>` ``ZIMID``.
|
|
|
|
If no illustration of requested size is found a HTTP 404 error is returned.
|
|
|
|
|
|
``/catalog/v2/languages``
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Returns the full list of ZIM file languages as an `OPDS Navigation Feed
|
|
<https://specs.opds.io/opds-1.2#22-navigation-feeds>`_.
|
|
|
|
|
|
``/catalog/v2/partial_entries``
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Returns the full or filtered list of ZIM files as an `OPDS acquisition feed
|
|
<https://specs.opds.io/opds-1.2#23-acquisition-feeds>`_ with `partial entries
|
|
<https://specs.opds.io/opds-1.2#512-partial-and-complete-catalog-entries>`_.
|
|
|
|
Supported filters are the same as for the `/catalog/v2/entries`_ endpoint.
|
|
|
|
|
|
.. _legacy-opds-api:
|
|
|
|
``/catalog`` (Legacy OPDS API)
|
|
------------------------------
|
|
|
|
The legacy OPDS API is preserved for backward compatibility and is deprecated.
|
|
:ref:`New OPDS API <new-opds-api>` should be used instead.
|
|
|
|
|
|
``/catalog/root.xml``
|
|
^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Full library OPDS catalog (list of all ZIM files).
|
|
|
|
|
|
``/catalog/searchdescription.xml``
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Describes the `/catalog/search`_ endpoint in `OpenSearch description format
|
|
<https://developer.mozilla.org/en-US/docs/Web/OpenSearch>`_.
|
|
|
|
|
|
``/catalog/search``
|
|
^^^^^^^^^^^^^^^^^^^
|
|
|
|
Returns the list of ZIM files (in OPDS catalog format) matching the
|
|
search/filtering criteria. Supported filters are the same as for the
|
|
`/catalog/v2/entries`_ endpoint.
|
|
|
|
|
|
``/catch``
|
|
----------
|
|
|
|
Blablabla
|
|
|
|
|
|
``/content``
|
|
------------
|
|
|
|
ZIM file content is served under the ``/content`` endpoint as described below.
|
|
|
|
|
|
``/content/ZIMNAME/PATH/IN/ZIMFILE``
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
Returns the entry with path ``PATH/IN/ZIMFILE`` from ZIM file with :term:`name
|
|
<ZIM name>` ``ZIMNAME``.
|
|
|
|
|
|
``/content/ZIMNAME``
|
|
^^^^^^^^^^^^^^^^^^^^
|
|
|
|
``/content/ZIMNAME`` redirects to the main page of the ZIM file with :term:`name
|
|
<ZIM name>` ``ZIMNAME`` (unless that ZIM file contains an entry with an empty
|
|
path or path equal to ``/``, in which case that entry is returned).
|
|
|
|
|
|
``/random``
|
|
-----------
|
|
|
|
**Usage:**
|
|
|
|
``/random?content=ZIMNAME``
|
|
|
|
Generates a HTTP redirect to a randomly selected article/page from the
|
|
specified ZIM file.
|
|
|
|
**Parameters:**
|
|
|
|
``content``: :term:`name <ZIM name>` of the ZIM file.
|
|
|
|
|
|
``/raw``
|
|
--------
|
|
|
|
Blablabla
|
|
|
|
|
|
``/search``
|
|
-----------
|
|
|
|
Blablabla
|
|
|
|
|
|
``/skin``
|
|
-----------
|
|
|
|
Static front-end resources (such as CSS, javascript and images) are all grouped
|
|
under ``/skin``.
|
|
|
|
**Usage:**
|
|
``/skin/PATH/TO/RESOURCE[?cacheid=CACHEID]``
|
|
|
|
`Cache busting
|
|
<https://javascript.plainenglish.io/what-is-cache-busting-55366b3ac022>`_ of
|
|
static resources is supported via the optional param ``cacheid``. By default,
|
|
i.e. when the ``cacheid`` parameter is not specified while accessing the
|
|
``/skin`` endpoint, static resources are served as if they were dynamic (i.e.
|
|
could be different for an immediately repeated request). Specifying the
|
|
``cacheid`` parameter with a correct value (matching the value embedded in the
|
|
``kiwix-serve`` instance), makes the returned resource to be presented as
|
|
immutable. However, if the value of the ``cacheid`` parameter mismatches then
|
|
``kiwix-serve`` responds with a 404 HTTP error.
|
|
|
|
``kiwix-serve``'s default front-end (the :ref:`welcome page <welcome-page>` and
|
|
the :ref:`ZIM file viewer <zim-file-viewer>`) access all underlying static
|
|
resources by using explicit ``cacheid`` s.
|
|
|
|
|
|
``/suggest``
|
|
------------
|
|
|
|
Blablabla
|
|
|
|
|
|
.. _zim-file-viewer:
|
|
|
|
``/viewer``
|
|
-----------
|
|
|
|
ZIM file viewer. The ZIM file and entry therein must be specified via the hash
|
|
component of the URL as ``/viewer#ZIMNAME/PATH/IN/ZIMFILE``.
|
|
|
|
|
|
``/viewer_settings.js``
|
|
-----------------------
|
|
|
|
Settings of the ZIM file viewer that are configurable via certain command line
|
|
options of ``kiwix-serve`` (e.g. ``--nolibrarybutton``).
|
|
|
|
|
|
/ANYTHING/ELSE
|
|
--------------
|
|
|
|
Any other URL is considered as an attempt to access ZIM file content using the
|
|
legacy URL scheme and is redirected to ``/content/ANYTHING/ELSE``.
|
|
|
|
|
|
Glossary
|
|
========
|
|
|
|
.. glossary::
|
|
|
|
ZIM filename
|
|
|
|
Name of a ZIM file on the server filesystem.
|
|
|
|
ZIM name
|
|
|
|
Identifier of a ZIM file in the server's library (used for referring to a
|
|
particular ZIM file in requests).
|
|
|
|
For a ``kiwix-serve`` started with a list of ZIM files, ZIM names are
|
|
derived from the filename by dropping the extension and replacing certain
|
|
characters (spaces are replaced with underscores, and ``+`` symbols are
|
|
replaced with the text ``plus``). Presence of the
|
|
:option:`-z`/:option:`--nodatealiases` option will create additional names
|
|
(aliases) for filenames with dates.
|
|
|
|
For a ``kiwix-serve`` started with the :option:`--library` option, ZIM
|
|
names come from the library XML file.
|
|
|
|
ZIM names are expected to be unique across the library. Any name conflicts
|
|
(including those caused by the usage of the
|
|
:option:`-z`/:option:`--nodatealiases` option) are reported on STDERR but,
|
|
otherwise, are ignored.
|
|
|
|
ZIM title
|
|
|
|
Title of a ZIM file. This can be any text (with whitespace). It is never
|
|
used as a way of referring to a ZIM file.
|
|
|
|
ZIM UUID
|
|
|
|
This is a unique identifier of a ZIM file designated at its creation time
|
|
and embedded in the ZIM file. Certain ``kiwix-serve`` operations may
|
|
require that a ZIM file be referenced through its UUID rather than name.
|