DB Cassandra Module

Anca Vamanu

<anca.vamanu@1and1.ro>

Edited by

Anca Vamanu

<anca.vamanu@1and1.ro>

Copyright © 2012 1&1 Internet AG
__________________________________________________________________

Table of Contents

1. Admin Guide

   1. Overview
   2. Dependencies

      2.1. SIP Router Modules
      2.2. External Libraries or Applications

   3. Parameters

      3.1. schema_path (string)

   4. Functions
   5. Installation
   6. Table schema
   7. Limitations

List of Examples

1.1. Set schema_path parameter

Chapter 1. Admin Guide

1. Overview

db_cassandra is one of the SIP Router database modules. It does not
export any functions executable from the configuration scripts, but it
exports a subset of functions from the database API, and thus other
modules can use it as a database driver instead of, for example, the
MySQL module.

The storage backend is a Cassandra cluster, and this module provides an
SQL interface that other modules use for storing and retrieving data.
Because Cassandra is a NoSQL distributed system, there are limitations
on the operations that can be performed. The limitations concern the
indexes on which queries are performed: only simple conditions
(equality comparison) are possible, and only two indexing levels are
available. These issues are explained with an example in the Table
schema section.

Cassandra is especially suited for storing large amounts of data or
data that requires distribution, redundancy or replication. One usage
example is a distributed location system in a platform that has a
cluster of SIP Router servers, with several proxies and registration
servers accessing the same location database. This was the main use
case we had in mind when implementing this module. Please NOTE that it
has only been tested with the usrloc, auth_db and domain modules.

You can find an example configuration file for this usage shipped with
the module: kamailio_cassa.cfg.

Because the module has to translate SQL queries into Cassandra NoSQL
queries, it must know the schemas of the tables. You will find the
schemas for the location, subscriber and version tables in the
utils/kamctl/dbcassandra directory. You have to provide the path to the
directory containing the table definitions by setting the module
parameter schema_path.

There is no need to configure table metadata in the Cassandra cluster.
You only need to define a keyspace with the name of the database and,
for each table, a column family inside that keyspace with the name of
the table. The comparator and validators should be either UTF8Type or
ASCIIType. Example:
...
create keyspace openser;
use openser;
create column family 'location' with comparator='UTF8Type' and
default_validation_class='UTF8Type' and key_validation_class='UTF8Type';
...

Special attention was given to performance in Cassandra. Therefore, the
implementation uses only the native row indexing in Cassandra and no
secondary indexes, because they are costly. Instead, we simulate a
secondary index by using the column names and putting information in
them, which is very efficient. Also, for deleting expired records, we
let Cassandra take care of this with its own mechanism (by setting the
TTL for columns).

2. Dependencies

2.1. SIP Router Modules
2.2. External Libraries or Applications

2.1. SIP Router Modules

The following modules must be loaded before this module:
* No dependencies on other SIP Router modules.

2.2. External Libraries or Applications

The following libraries or applications must be installed before
running SIP Router with this module loaded:
* Thrift library version 0.6.1.

The implementation was tested with Cassandra version 1.0.1.

3. Parameters

3.1. schema_path (string)

3.1. schema_path (string)

The directory where the files with the table schemas are located. This
directory has to contain one subdirectory per database, named after
the database (name of the directory = name of the database). These
subdirectories, in turn, contain the files with the table schemas. See
the schemas in the utils/kamctl/dbcassandra directory.

Example 1.1. Set schema_path parameter
...
modparam("db_cassandra", "schema_path",
    "/usr/local/sip-router/etc/kamctl/dbcassandra")
...

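The directory layout described for schema_path can be sketched in a
few lines of Python. This is only an illustration, not code from the
module: the helper name schema_file_path is hypothetical, and the
sketch assumes that each schema file is named after its table, which
the text above does not state explicitly.

```python
import os


def schema_file_path(schema_path, db_name, table_name):
    """Return where a table schema file is expected to live.

    Layout taken from the parameter description:
        <schema_path>/<database name>/<schema file>
    Assumption (not confirmed by the text): the schema file is named
    after the table.
    """
    return os.path.join(schema_path, db_name, table_name)


# Example with the path from Example 1.1 and the 'openser' keyspace
# used in the Overview.
print(schema_file_path("/usr/local/sip-router/etc/kamctl/dbcassandra",
                       "openser", "location"))
```
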
4. Functions

NONE

5. Installation

Because it depends on an external library, the db_cassandra module is
not compiled and installed by default. You can use one of these
options:
* Edit the "Makefile" and remove "db_cassandra" from the
  "excluded_modules" list. Then follow the standard procedure to
  install SIP Router: "make all; make install".
* From the command line, run: 'make all include_modules="db_cassandra";
  make install include_modules="db_cassandra"'.

6. Table schema

The module must know the table schema for the tables that will be used.
You must configure the path to the schema directory by setting the
schema_path parameter.

A table schema document has the following structure:
* First row: the names and types of the columns, in the form
  name(type), separated by spaces. The possible types are: string,
  int, double and timestamp.
  The timestamp type has a special meaning. Only one column of this
  type can be defined for a table, and it should contain the expiry
  time for that record. If defined, this value is used to compute the
  TTL for the columns, and Cassandra automatically deletes the columns
  when they expire. Because the TTL must apply to the entire record,
  it has to be updated for all columns of that record whenever it
  changes. In other words, to update the expiration time of a record,
  the db_cassandra module must perform an insert operation ("insert"
  in Cassandra means "replace if exists or insert new record
  otherwise"). So, if you define a table with a timestamp column, any
  update operation on that table that also changes the timestamp must
  update all columns, which means it must in fact be an insert
  operation.
* Second row: the columns that form the row key, separated by spaces.
* Third row: the columns that form the secondary key, separated by
  spaces.

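To make the three-row structure concrete, here is a minimal Python
sketch of a parser for such a schema document. It is not the parser
used by the module; the function name parse_schema and the returned
dictionary layout are hypothetical. The sketch also assumes that the
third row may be missing for tables that have no secondary key.

```python
import re

# One column definition looks like name(type), with the four types
# listed in the schema description.
COLUMN_RE = re.compile(r"^(\w+)\((string|int|double|timestamp)\)$")


def parse_schema(text):
    """Parse a schema document: columns row, row key row, secondary key row."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected a columns row and a row key row")

    columns = {}
    for token in lines[0].split():
        match = COLUMN_RE.match(token)
        if match is None:
            raise ValueError("bad column definition: %r" % token)
        columns[match.group(1)] = match.group(2)

    row_key = lines[1].split()
    # Assumption: a table without a secondary key simply omits the third row.
    secondary_key = lines[2].split() if len(lines) > 2 else []

    for name in row_key + secondary_key:
        if name not in columns:
            raise ValueError("key column %r is not defined" % name)

    # Only one timestamp column is allowed per table.
    timestamps = [n for n, t in columns.items() if t == "timestamp"]
    if len(timestamps) > 1:
        raise ValueError("only one timestamp column is allowed")

    return {
        "columns": columns,
        "row_key": row_key,
        "secondary_key": secondary_key,
        "timestamp": timestamps[0] if timestamps else None,
    }


# A shortened, made-up schema in the same format.
example = """username(string) contact(string) expires(timestamp) q(double)
username
contact
"""
print(parse_schema(example))
```
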
Below you can see the schema for the location table (the first row is
a single line in the schema file):

...
callid(string) cflags(int) contact(string) cseq(int) expires(timestamp) flags(int) last_modified(int) methods(int) path(string) q(double) received(string) socket(string) user_agent(string) username(string) ruid(string) instance(string) reg_id(int)
username
contact
...

Observe first that the row key is the username and the secondary index
is the contact. We have also defined a timestamp column: expires. In
this example, both the row key and the secondary index consist of a
single column, but each can be formed from several columns, listed on
the same row and separated by spaces.

To understand why the schema looks like this, we must first see which
queries are performed on the location table. (The 'callid' condition
is ignored, as it does not have a well-defined role in the SIP RFC.)
* When an INVITE is received, look up the location: select where
  username='..'.
* When a REGISTER is received, update the registration: update where
  username='..' and contact='..'.

So, the relation between these keys is the following:
* The unique key for a table is the combination of row key +
  secondary key.
* A row identified by a row key can contain several records with
  different secondary keys.

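The relation between the two keys can be pictured as a two-level map:
the row key selects a row, and the secondary key selects one record
inside that row. The Python sketch below is only a conceptual model
under that assumption. The class name KeyedTable is hypothetical, and
the sketch does not reproduce how the module actually encodes the
secondary key in Cassandra column names.

```python
class KeyedTable:
    """Conceptual model: rows[row key][secondary key] -> record."""

    def __init__(self, row_key, secondary_key):
        self.row_key = list(row_key)
        self.secondary_key = list(secondary_key)
        self.rows = {}

    def insert(self, record):
        """Replace the record if it exists, insert it otherwise."""
        rk = tuple(record[c] for c in self.row_key)
        sk = tuple(record[c] for c in self.secondary_key)
        self.rows.setdefault(rk, {})[sk] = dict(record)

    def lookup(self, **row_key_values):
        """Return all records stored under one row key."""
        rk = tuple(row_key_values[c] for c in self.row_key)
        return list(self.rows.get(rk, {}).values())


# The location table: row key 'username', secondary key 'contact'.
location = KeyedTable(["username"], ["contact"])
location.insert({"username": "alice", "contact": "sip:alice@10.0.0.1", "q": 1.0})
location.insert({"username": "alice", "contact": "sip:alice@10.0.0.2", "q": 0.5})
# Same username and contact: this replaces the first record.
location.insert({"username": "alice", "contact": "sip:alice@10.0.0.1", "q": 0.8})

for record in location.lookup(username="alice"):
    print(record)
```

In this model the INVITE lookup needs only the row key, while the
REGISTER update addresses exactly one record through the row key and
the secondary key together.
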
The timestamp column, which leaves the deletion of expired records to
the Cassandra cluster, can be used only with a modification to the
usrloc module that replaces the update performed at re-registration
with an insert operation (so that all columns are updated). This
behavior can be enabled by setting the usrloc module parameter
db_update_as_insert:

...
modparam("usrloc", "db_update_as_insert", 1)
...

The alternative would have been to define an index on the expires
column and run an external job to periodically delete the expired
records. However, this would be more costly.

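The TTL handling described in this section can be illustrated with a
short Python sketch. It is an illustration only: the function names
are hypothetical, and it assumes that the timestamp column holds a
Unix time in seconds and that the TTL is the difference between that
value and the current time. The text above does not give the exact
formula used by the module.

```python
import time


def ttl_from_expires(expires, now=None):
    """Seconds left until 'expires' (assumed Unix time), never negative."""
    if now is None:
        now = int(time.time())
    return max(0, int(expires) - int(now))


def columns_with_ttl(record, timestamp_column, now=None):
    """Attach the same TTL to every column of a record.

    This mirrors the requirement that an update of the expiry time
    must rewrite all columns, so that the whole record expires at once.
    """
    ttl = ttl_from_expires(record[timestamp_column], now)
    return [(name, value, ttl) for name, value in record.items()]


record = {"username": "alice", "contact": "sip:alice@10.0.0.1", "expires": 4600}
for column in columns_with_ttl(record, "expires", now=1000):
    print(column)
```

If only the expires column were rewritten on re-registration, the
other columns would keep their old TTL and disappear earlier, which is
why the update has to be performed as an insert.
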
7. Limitations

The module can be used only when the queries use a single index that
is also the unique key, or two indexes that together form the unique
key, as in the usrloc usage.

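One possible reading of this restriction, combined with the location
queries listed in the Table schema section, is sketched below in
Python. The function is_supported_query is hypothetical and is not the
check performed by the module. It assumes that a query is usable when
all conditions are equality comparisons and the condition columns are
either exactly the row key or exactly the row key plus the secondary
key.

```python
def is_supported_query(conditions, row_key, secondary_key=()):
    """Check a list of (column, operator) conditions against the keys.

    Assumed rule (a reading of the text, not the module's code):
    equality only, on the row key or on row key + secondary key.
    """
    if not conditions:
        return False
    if any(op != "=" for _, op in conditions):
        return False
    columns = {col for col, _ in conditions}
    row = set(row_key)
    return columns == row or columns == row | set(secondary_key)


row_key, secondary_key = ["username"], ["contact"]

# Lookup on INVITE: condition on the row key only.
print(is_supported_query([("username", "=")], row_key, secondary_key))
# Update on REGISTER: conditions on row key and secondary key.
print(is_supported_query([("username", "="), ("contact", "=")],
                         row_key, secondary_key))
# A range condition on a non-key column does not fit the rule.
print(is_supported_query([("expires", "<")], row_key, secondary_key))
```
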