mirror of https://github.com/sipwise/kamailio.git
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
|
|
13 years ago | |
|---|---|---|
| .. | ||
| doc | 13 years ago | |
| Cassandra.cpp | 13 years ago | |
| Cassandra.h | 13 years ago | |
| Makefile | 13 years ago | |
| README | 13 years ago | |
| cassandra_constants.cpp | 13 years ago | |
| cassandra_constants.h | 13 years ago | |
| cassandra_types.cpp | 13 years ago | |
| cassandra_types.h | 13 years ago | |
| db_cassandra.c | 13 years ago | |
| dbcassa_base.cpp | 13 years ago | |
| dbcassa_base.h | 13 years ago | |
| dbcassa_table.c | 13 years ago | |
| dbcassa_table.h | 13 years ago | |
| kamailio_cassa.cfg | 13 years ago | |
README
DB Cassandra Module
Anca Vamanu
<anca.vamanu@1and1.ro>
Edited by
Anca Vamanu
<anca.vamanu@1and1.ro>
Copyright © 2012 1&1 Internet AG
__________________________________________________________________
Table of Contents
1. Admin Guide
1. Overview
2. Dependencies
2.1. SIP Router Modules
2.2. External Libraries or Applications
3. Parameters
3.1. schema_path (string)
4. Functions
5. Installation
6. Table schema
7. Limitations
List of Examples
1.1. Set schema_path parameter
Chapter 1. Admin Guide
Table of Contents
1. Overview
2. Dependencies
2.1. SIP Router Modules
2.2. External Libraries or Applications
3. Parameters
3.1. schema_path (string)
4. Functions
5. Installation
6. Table schema
7. Limitations
1. Overview
Db_cassandra is one of the SIP Router database modules. It does not
export any functions executable from the configuration scripts, but it
exports a subset of functions from the database API, and thus, other
modules can use it as a database driver, instead of, for example, the
Mysql module.
The storage backend is a Cassandra cluster and this module provides an
SQL interface to be used by other modules for storing and retrieving
data. Because Cassandra is a NoSQL distributed system, there are
limitations on the operations that can be performed. The limitations
concern the indexes on which queries are performed, as it is only
possible to have simple conditions (equality comparison only) and only
two indexing levels. These issues will be explained in an example
below.
Cassandra DB is especially suited for storing large data or data that
requires distribution, redundancy or replication. One usage example is
a distributed location system in a platform that has a cluster of SIP
Router servers, with more proxies and registration servers accessing
the same location database. This was actually the main use case we had
in mind when implementing this module. Please NOTE that it has only
been tested with the usrloc, auth_db and domain modules.
You can find a configuration file example for this usage in the module
- kamailio_cassa.cfg.
Because the module has to do the translation from SQL to Cassandra
NoSQL queries, the schemas for the tables must be known by the module.
You will find the schemas for location, subscriber and version tables
in utils/kamctl/dbcassandra directory. You have to provide the path to
the directory containing the table definitions by setting the module
parameter schema_path.
There is no need to configure a table metadata in Cassandra cluster.
You only need to define a keyspace with the name of the database and
for each table a column family inside that keyspace with the name of
the table. The comparator and validators should be either UTF8Type or
ASCIIType. Example:
...
create keyspace openser;
use openser;
create column family 'location' with comparator='UTF8Type' and
default_validation_class='UTF8Type' and key_validation_class='UTF8Type';
...
Special attention was given to performance in Cassandra. Therefore, the
implementation uses only the native row indexing in Cassandra and no
secondary indexes, because they are costly. Instead, we simulate a
secondary index by using the column names and putting information in
them, which is very efficient. Also, for deleting expired records, we
let Cassandra take care of this with its own mechanism (by setting the
TTL for columns).
2. Dependencies
2.1. SIP Router Modules
2.2. External Libraries or Applications
2.1. SIP Router Modules
The following modules must be loaded before this module:
* No dependencies on other SIP Router modules.
2.2. External Libraries or Applications
The following libraries or applications must be installed before
running SIP Router with this module loaded:
* Thrift library version 0.6.1 .
The implementation was tested with Cassandra version 1.0.1 .
3. Parameters
3.1. schema_path (string)
3.1. schema_path (string)
The directory where the files with the table schemas are located. This
directory has to contain the subdirectories corresponding to the
databases' names (name of the directory = name of the database). These
directories, in turn, contain the files with the table schemas. See the
schemas in utils/kamctl/dbcassandra directory.
Example 1.1. Set schema_path parameter
...
modparam("db_cassandra", "schema_path",
"/usr/local/sip-router/etc/kamctl/dbcassandra")
...
4. Functions
NONE
5. Installation
Because it dependes on an external library, the db_cassandra module is
not compiled and installed by default. You can use one of these
options:
* - edit the "Makefile" and remove "db_cassandra" from
"excluded_modules" list. Then follow the standard procedure to
install SIP Router: "make all; make install".
* - from command line, run: 'make all include_modules="db_cassandra";
make install include_modules="db_cassandra"'.
6. Table schema
The module must know the table schema for the tables that will be used.
You must configure the path to the schema directory by setting the
schema_path parameter.
A table schema document has the following structure:
* First row: the name and type of the columns in the form name(type)
separated by spaces. The possible types are: string, int, double
and timestamp.
Thetimestamp type has a special meaning. Only one column of this
type can be defined for a table, and it should contain the expiry
time for that record. If defined this value will be used to compute
the ttl for the columns and Cassandra will automatically delete the
columns when they expire. Because we want the ttl to have meaning
for the entire record, we must ensure that when the ttl is updated,
it is updated for all columns for that record. In other words, to
update the expiration time of a record, an insert operation must be
performed from the point of view of the db_cassandra module
("insert" in Cassandra means "replace if exists or insert new
record otherwise"). So, if you define a table with a timestamp
column, the update operations on that table that also update the
timestamp must update all columns. So, these update operations must
in fact be insert operations.
* Second row: the columns that form the row key separated by space.
* Third row: the columns that form the secondary key separated by
space.
Bellow you can see the schema for the location table:
...
callid(string) cflags(int) contact(string) cseq(int) expires(timestamp) flags
(int) last_modified(int) methods(int) path(string) q(double) received(string) so
cket(string) user_agent(string) username(string) ruid(string) instance(string) r
eg_id(int)
username
contact
...
Observe first that the row key is the username and the secondary index
is the contact. We have also defined a timestamp column - expires. In
this example, both the row key and the secondary index are defined by
only one column, but they can be formed out of more columns. You can
list them separated by space.
To understand why the schema looks like this, we must first see which
queries are performed on the location table. (The 'callid' condition
was ignored as it doesn't really have a well defined role in the SIP
RFC).
* When Invite received, lookup location: select where username='..'.
* When Register received, update registration: update where
username='..' and contact='..'.
So, the relation between these keys is the following:
* The unique key for a table is actually the combination of row key +
secondary key.
* A row defined by a row key will contain more records with different
secondary keys.
The timestamp column that leaves the Cassandra cluster to deal with
deleting expired record can be used only with a modification to the
usrloc module that replaces the update performed at re-registration
with an insert operation (so that all columns are updated). This
behavior can be enabled by setting a parameter in the usrloc module
db_update_as_insert:
...
modparam("usrloc", "db_update_as_insert", 1)
...
The alternative would have been to define an index on the expire column
and run a external job to periodically delete the expired records.
However, obviously, this would be more costly.
7. Limitations
The module can be used only when the queries use only one index, which
is also the unique key, or have two indexes that form the unique key
like in the usrloc usage.