The Server Startup
The main function in file
main.c is the first function called upon server
startup. Its purpose is to initialize the server and enter the main
loop. The server initialization is described in the following
sections.
The individual initialization steps are described in the order in which they
appear in the main function.
Installation Of New Signal Handlers
The first step in the initialization process is the installation of
new signal handlers. We need our own signal handlers to be able to
perform a graceful shutdown, print server statistics and so on. There is
only one signal handler function, the function
sig_usr in file main.c.
The following signals are handled by the function: SIGINT, SIGPIPE,
SIGUSR1, SIGCHLD, SIGTERM, SIGHUP and SIGUSR2.
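Installing one handler for a list of signals can be sketched as follows. This is only an illustration: the names on_signal, install_handlers and last_signal are hypothetical, and the use of sigaction is an assumption (the real sig_usr and its installation code in main.c may differ).

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag: remembers the last signal the handler saw. */
static volatile sig_atomic_t last_signal = 0;

/* A single handler shared by all signals, in the spirit of sig_usr. */
static void on_signal(int signo)
{
    last_signal = signo;
}

/* Install on_signal for every signal in the list; returns 0 on success. */
int install_handlers(void)
{
    static const int signals[] = {
        SIGINT, SIGPIPE, SIGUSR1, SIGCHLD, SIGTERM, SIGHUP, SIGUSR2
    };
    struct sigaction sa;
    size_t i;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);

    for (i = 0; i < sizeof(signals) / sizeof(signals[0]); i++) {
        if (sigaction(signals[i], &sa, NULL) < 0)
            return -1;
    }
    return 0;
}

/* Accessor so callers do not touch the flag directly. */
int get_last_signal(void)
{
    return (int)last_signal;
}
```

The handler only records the signal number; any real work (shutdown, statistics) would be done later, outside the handler.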
Processing Command Line Parameters
SER utilizes the getopt function to parse
command line parameters. The function is described in detail in
its manual page.
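A minimal sketch of getopt-based parsing is shown below. It handles only the -u and -g parameters mentioned later in this chapter; the struct options type and the parse_options function are hypothetical and do not reproduce the option handling in main.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical container for the parsed values. */
struct options {
    const char *user;   /* value of -u, or NULL */
    const char *group;  /* value of -g, or NULL */
};

/* Parse -u <user> and -g <group>; returns 0 on success, -1 on error. */
int parse_options(int argc, char **argv, struct options *opt)
{
    int c;

    opt->user = NULL;
    opt->group = NULL;

    optind = 1;  /* restart scanning in case getopt was used before */
    opterr = 0;  /* report errors through the return value instead */

    while ((c = getopt(argc, argv, "u:g:")) != -1) {
        switch (c) {
        case 'u':
            opt->user = optarg;
            break;
        case 'g':
            opt->group = optarg;
            break;
        default:
            return -1;  /* unknown option or missing argument */
        }
    }
    return 0;
}
```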
Parser Initialization
SER contains a fast 32-bit parser. The parser uses pre-calculated
hash tables that need to be filled in upon startup. The
initialization is done here by two functions.
Function init_hfname_parser initializes the
hash table in the header field name parser and function
init_digest_parser initializes the hash table in the
digest authentication parser. The parser's internals will be
described later.
Malloc Initialization
To make SER even faster we decided to re-implement memory
allocation routines. The new malloc better
fits our needs and speeds up the server a lot. The memory
management subsystem needs to be initialized upon server
startup. The initialization mainly creates internal data structures
and allocates the memory region to be partitioned.
The memory allocation code must be initialized
BEFORE any of its functions is called!
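Why the initialization must come first can be illustrated with a deliberately simple allocator that carves chunks out of one pre-allocated region. This is a sketch under stated assumptions, not SER's allocator: pool_init, pool_alloc and the bump-pointer strategy are all hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool state; not SER's real data structures. */
static char  *pool_start = NULL;
static size_t pool_size  = 0;
static size_t pool_used  = 0;

/* Allocate the region to be partitioned. Returns 0 on success. */
int pool_init(size_t size)
{
    pool_start = malloc(size);
    if (pool_start == NULL)
        return -1;
    pool_size = size;
    pool_used = 0;
    return 0;
}

/* Hand out the next chunk; NULL if the pool is uninitialized or full. */
void *pool_alloc(size_t size)
{
    void *p;
    size_t aligned = (size + 7u) & ~(size_t)7u;  /* round up to 8 bytes */

    if (pool_start == NULL)
        return NULL;                 /* pool_init was not called yet */
    if (aligned < size || aligned > pool_size - pool_used)
        return NULL;                 /* overflow or not enough room */

    p = pool_start + pool_used;
    pool_used += aligned;
    return p;
}
```

In this sketch a call made before pool_init simply fails; an allocator without such a check would dereference uninitialized state instead.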
Timer Initialization
Various subsystems of the server must be called periodically
regardless of the incoming requests. That is what the timer is
for. Function init_timer initializes the timer
subsystem. The function is called from main.c
and can be found in timer.c. The timer
subsystem will be described later.
The timer subsystem must be initialized before the config file is parsed!
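The idea of calling subsystems periodically can be sketched with a list of callbacks that a ticker walks on every tick. All names here (timer_entry, add_periodic, timer_tick, count_calls) are hypothetical; the real interface lives in timer.c and is described later.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void (*timer_cb)(unsigned int ticks, void *param);

/* Hypothetical registration record. */
struct timer_entry {
    timer_cb cb;
    void *param;
    unsigned int interval;     /* call every `interval` ticks */
    unsigned int expires;      /* tick at which the next call is due */
    struct timer_entry *next;
};

static struct timer_entry *timer_list = NULL;
static unsigned int current_tick = 0;

/* Register a callback to be run every `interval` ticks. */
int add_periodic(timer_cb cb, void *param, unsigned int interval)
{
    struct timer_entry *t;

    if (cb == NULL || interval == 0)
        return -1;
    t = malloc(sizeof(*t));
    if (t == NULL)
        return -1;
    t->cb = cb;
    t->param = param;
    t->interval = interval;
    t->expires = current_tick + interval;
    t->next = timer_list;
    timer_list = t;
    return 0;
}

/* Advance time by one tick and run every callback that is due. */
void timer_tick(void)
{
    struct timer_entry *t;

    current_tick++;
    for (t = timer_list; t != NULL; t = t->next) {
        if (current_tick >= t->expires) {
            t->cb(current_tick, t->param);
            t->expires = current_tick + t->interval;
        }
    }
}

/* Example callback: counts how many times it was invoked. */
void count_calls(unsigned int ticks, void *param)
{
    (void)ticks;
    (*(int *)param)++;
}
```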
FIFO Initialization
SER has built-in support for FIFO control. It means that the
running server can accept commands over a FIFO special file (a
named pipe). Function register_core_fifo
initializes the FIFO subsystem and registers basic commands that are
processed by the core itself. The function can be found in file
fifo_server.c.
The FIFO server will be described in another chapter.
Built-in Module Initialization
Modules can be either loaded dynamically at runtime or compiled in statically. When a module
is loaded at runtime, it is registered immediately with the core. (Module registration is
the process in which the core finds out what functions and
parameters are offered by the module.) When the module is compiled in
statically, the registration must be
performed during the server startup. Function
register_builtin_modules does the job.
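Registration of statically linked modules can be pictured as walking a fixed list of export structures and adding each to a table. The sketch below is hypothetical throughout: the simplified module_exports layout, register_module, find_module and the two demo modules are assumptions, not the code behind register_builtin_modules.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified description of what a module offers. */
struct module_exports {
    const char *name;
    const char **functions;  /* NULL-terminated list of function names */
};

#define MAX_MODULES 8
static const struct module_exports *registered[MAX_MODULES];
static int registered_count = 0;

/* Look up a registered module by name; NULL if it is not registered. */
const struct module_exports *find_module(const char *name)
{
    int i;

    for (i = 0; i < registered_count; i++) {
        if (strcmp(registered[i]->name, name) == 0)
            return registered[i];
    }
    return NULL;
}

/* Add a module to the table; fails on duplicates or a full table. */
int register_module(const struct module_exports *e)
{
    if (e == NULL || e->name == NULL)
        return -1;
    if (registered_count >= MAX_MODULES || find_module(e->name) != NULL)
        return -1;
    registered[registered_count++] = e;
    return 0;
}

/* Two made-up modules compiled into the binary. */
static const char *demo_a_funcs[] = { "a_func", NULL };
static const char *demo_b_funcs[] = { "b_func", NULL };
static const struct module_exports demo_a = { "demo_a", demo_a_funcs };
static const struct module_exports demo_b = { "demo_b", demo_b_funcs };

/* Register every built-in module; returns the number of failures. */
int register_builtin_demo(void)
{
    int failures = 0;

    if (register_module(&demo_a) != 0)
        failures++;
    if (register_module(&demo_b) != 0)
        failures++;
    return failures;
}
```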
Server Configuration
The server is configured through a configuration file. The
configuration file is a C-Shell-like script which defines how
incoming requests should be processed. The file cannot be
interpreted directly because that would be very slow. Instead,
the file is translated into an internal binary
representation. The process is called compilation and will be
described in the following sections.
The following sections only describe how the internal binary
representation is constructed from the config file. The
way the binary representation is used upon a request
arrival will be described later.
The compilation can be divided into several steps:
Lexical Analysis
Lexical analysis is the process of converting the input (the
configuration file in this case) into a stream of tokens. A
token is a sequence of characters that 'belong' together. A program
that turns the input into a stream of tokens is called a
scanner. For example, when the scanner encounters a number in the
config file, it will produce the token NUMBER.
There is no need to implement the scanner from scratch; it can
be generated automatically by a utility called flex. Flex
accepts an input file and generates a scanner according to
it. The input file for flex
consists of several lines, each line describing one token. The
tokens are described using regular expressions. For more
details, see the flex manual page or info documentation.
The flex input file for the SER config file is
cfg.lex. The file is processed by flex
when the server is being compiled and the result is written to
file lex.yy.c. The output file contains
the scanner implemented in the C language.
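To make the notion of a token stream concrete, here is a tiny hand-written scanner. It is not what flex generates and it does not follow cfg.lex; the token names other than NUMBER and the next_token function are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token set; only NUMBER is named in the text. */
enum token { TOK_EOF, TOK_NUMBER, TOK_ID, TOK_OTHER };

/* Read one token from *input and advance *input past it. */
enum token next_token(const char **input)
{
    const char *p = *input;
    enum token t;

    /* Whitespace separates tokens but is not a token itself. */
    while (isspace((unsigned char)*p))
        p++;

    if (*p == '\0') {
        t = TOK_EOF;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        t = TOK_NUMBER;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        t = TOK_ID;
    } else {
        p++;                 /* any other single character */
        t = TOK_OTHER;
    }

    *input = p;
    return t;
}
```

For the input "children = 4" this sketch yields an identifier, a single-character token, a NUMBER, and then end of input.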
Syntactical Analysis
The second stage of configuration file processing is called
syntactical analysis. The purpose of syntactical analysis is to
check that the configuration file is well formed and
contains no syntactical errors, and to perform various actions at
various stages of the analysis. A program performing syntactical
analysis is called a parser.
The structure of the configuration file is described using a
grammar. A grammar is a set of rules describing the valid 'order' or
'combination' of tokens. If the file does not conform to
its grammar, it is syntactically invalid and cannot be further
processed. In that case an error will be issued and the server
will abort.
There is a utility called yacc. The input of the utility is a file
containing the grammar of the configuration file. In addition
to the grammar, you can describe what action the parser should
perform at various stages of parsing. For example, you can instruct
the parser to create a structure describing an IP address every
time it finds an IP address in the configuration file and
convert the address to its binary representation.
For more information see the yacc documentation.
Yacc creates the parser when the server is being compiled from
the sources. The input file for yacc is
cfg.y. The file contains the grammar of the
config file along with actions that create the binary
representation of the file. Yacc will write its result into
file cfg.tab.c. The file contains function
yyparse which will parse the whole
configuration file and construct the binary representation. Bison
is a yacc-compatible parser generator; for
more information about the input file syntax see the bison
documentation.
Config File Structure
The configuration file consists of three sections. Each of the
sections will be described separately.
Route Statement - The statement
describes how incoming requests will be processed.
When a request is received, commands in one or more
"route" sections will be executed step by step. The
config file must always contain one main "route"
statement and may contain several additional "route"
statements. Request processing always starts at the
beginning of the main "route" statement. Additional
"route" statements can be called from the main one or
from other additional "route" statements (it is similar to
function calling).
Assign Statement - There are many
configuration variables across the server and this
statement makes it possible to change their
values. Generally it is a list of assignments, each
assignment on a separate line.
Module Statement - Additional
functionality of the server is available through
separate modules. Each module is a shared object that
can be loaded at runtime. Modules can export functions
that can be called from the configuration file and
variables that can be configured from the config
file. The module statement makes it possible to load
modules and configure them. There are two commands in
the statement - loadmodule and
modparam. The first one loads a
module. The second one configures the module's internal
variables.
In the following sections we will describe in detail how the
three sections are being processed upon server startup.
Route Statement
The following grammar snippet describes how the route statement is constructed:
route_stm = "route" "{" actions "}"
            {
                $$ = push($3, &rlist[DEFAULT_RT]);
            }
actions = actions action { $$ = append_action($1, $2); }
        | action { $$ = $1; }
action = cmd SEMICOLON { $$ = $1; }
       | SEMICOLON { $$ = 0; }
cmd = "forward" "(" host ")" { $$ = mk_action(FORWARD_T, STRING_ST, NUMBER_ST, $3, 0); }
    | ...
A config file can contain one or more "route"
statements. The "route" statement without a number is
executed first and is called the main route
statement. There can be additional route statements
identified by number; these additional route statements can
be called from the main route statement or from other
additional route statements.
Each route statement consists of a set of actions. Actions
in the route statement are executed step by step in the
same order in which they appear in the config file. Actions
in the route statement are delimited by a semicolon.
Each action consists of one and only one command (cmd in
the grammar). There are many types of commands defined. We
don't list all of them here because the list would be too
long and all the commands are processed in the same
way. Therefore we show only one example (forward);
interested readers can look in the cfg.y
file for the full list of available commands.
Each rule in the grammar contains a section enclosed in
curly braces. The section is the C code snippet that will
be executed every time the parser recognizes that rule in
the config file.
For example, when the parser finds a
forward command, the
mk_action function (as specified in
the grammar snippet above) will be called. The function
creates a new structure with the
type field set to FORWARD_T
representing the command. A pointer to the structure will be
returned as the value of the rule.
The pointer propagates through the action
rule to the actions
rule. The actions rule creates a linked
list of all commands. The linked list is then inserted
into the rlist table (function
push in rule
route_stm). Each element of the table
represents one "route" statement of the config file.
Each route statement of the configuration file is
represented by a linked list of all actions in the
statement. Pointers to all the lists are stored in the
rlist array. Additional route statements are identified by
number. The number also serves as the index into the array.
When the core is about to execute the route statement with
number n, it looks in the array at position n. If the
element at position n is not null then there is a linked
list of commands and the commands will be executed step by
step.
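The data structure described above, an array of linked lists indexed by route number, can be sketched as follows. The signatures of mk_action and append_action are simplified for illustration and differ from those in the grammar snippet; RT_NO, DROP_T and run_route are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

#define RT_NO 10       /* hypothetical size of the route table */
#define DEFAULT_RT 0   /* index of the main route statement */

enum action_type { FORWARD_T, DROP_T };

/* One command of a route statement. */
struct action {
    enum action_type type;
    struct action *next;
};

/* rlist[n] points to the list of actions of route statement n. */
struct action *rlist[RT_NO];

/* Simplified constructor: only the type field is filled in. */
struct action *mk_action(enum action_type type)
{
    struct action *a = calloc(1, sizeof(*a));

    if (a != NULL)
        a->type = type;
    return a;
}

/* Append `a` to the end of `list` and return the head of the list. */
struct action *append_action(struct action *list, struct action *a)
{
    struct action *t;

    if (list == NULL)
        return a;
    for (t = list; t->next != NULL; t = t->next)
        ;
    t->next = a;
    return list;
}

/* Walk route n step by step; returns the number of actions visited,
 * or -1 if there is no such route. */
int run_route(int n)
{
    const struct action *a;
    int count = 0;

    if (n < 0 || n >= RT_NO || rlist[n] == NULL)
        return -1;
    for (a = rlist[n]; a != NULL; a = a->next)
        count++;   /* real code would dispatch on a->type here */
    return count;
}
```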
The reply-route statement is compiled in the same way. The main differences are:
The reply-route statement is executed when a SIP
REPLY comes (not a SIP
REQUEST).
Only a subset of commands is allowed in the
reply-route statement. (See file
cfg.y for more details).
The reply-route statement has its own array of linked lists.
Assign Statement
The server contains many configuration variables. There is
a section of the config file in which the variables can be
assigned new values. The section is called the Assign
Statement. The following grammar snippet describes how the
section is constructed (only one example will be shown):
assign_stm = "children" '=' NUMBER { children_no=$3; }
| "children" '=' error { yyerror("number expected"); }
...
The number in the config file is assigned to the
children_no variable. The second
rule is executed if the parameter is not a number
or is in an invalid format; it issues an error and aborts
the server.
Module Statement
The module statement allows module loading and
configuration. There are two commands:
loadmodule - Load the
specified module in the form of a shared object. The
shared object will be loaded using
dlopen.
modparam - It is possible to
configure a module using this command. The command
accepts 3 parameters: module
name, variable name
and variable value.
The following grammar snippet describes the module statement:
module_stm = "loadmodule" STRING
{
DBG("loading module %s\n", $2);
if (load_module($2)!=0) {
yyerror("failed to load module");
}
}
| "loadmodule" error { yyerror("string expected"); }
| "modparam" "(" STRING "," STRING "," STRING ")"
{
if (set_mod_param($3, $5, PARAM_STR|PARAM_STRING, $7) != 0) {
yyerror("Can't set module parameter");
}
}
| "modparam" "(" STRING "," STRING "," NUMBER ")"
{
if (set_mod_param($3, $5, PARAM_INT, (void*)$7) != 0) {
yyerror("Can't set module parameter");
}
}
| "modparam" error { yyerror("Invalid arguments"); }
When the parser finds a loadmodule
command, it executes the statement in curly braces. The
statement calls the load_module
function. The function loads the specified file
using dlopen. If
dlopen is successful, the server
looks for the exports structure
describing the module's interface and registers the
module. For more details see the module section.
If the parser finds a modparam command,
it will try to configure the specified variable in the
specified module. The module must be loaded using
loadmodule before
modparam for the module can be used!
Function set_mod_param will be called
and will configure the variable in the specified module.
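How a parameter name from the config file can end up in a module's variable may be sketched like this. It is an assumption-laden illustration, not the implementation of set_mod_param: the param_export layout, the P_INT and P_STRING types, set_param and the demo variables are all hypothetical. As in the grammar snippet, an integer value is passed cast to a pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parameter types. */
enum param_type { P_STRING, P_INT };

/* Hypothetical export record: name, type, address of the variable. */
struct param_export {
    const char *name;
    enum param_type type;
    void *target;
};

/* A made-up module with two configurable variables. */
int demo_timeout = 30;
const char *demo_db_url = "none";

struct param_export demo_params[] = {
    { "timeout", P_INT,    &demo_timeout },
    { "db_url",  P_STRING, &demo_db_url  },
    { NULL,      P_INT,    NULL          }   /* end of table */
};

/* Find `name` with matching type and store `val`; 0 on success. */
int set_param(struct param_export *params, const char *name,
              enum param_type type, void *val)
{
    for (; params->name != NULL; params++) {
        if (strcmp(params->name, name) != 0 || params->type != type)
            continue;
        if (type == P_INT)
            *(int *)params->target = (int)(intptr_t)val;
        else
            *(const char **)params->target = (const char *)val;
        return 0;
    }
    return -1;   /* no such parameter, or wrong type */
}
```

Checking the type as well as the name lets the sketch reject a string given for an integer parameter, which is the kind of error the grammar reports with yyerror.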
Interface Configuration
The server will try to obtain a list of all configured interfaces of
the host it is running on. If that fails, the server tries to convert
the hostname to an IP address and will use only the interface with that
IP address.
Function add_interfaces will add all
configured interfaces to the array.
The server then tries to convert all interface names to IP addresses, removes duplicates and so on.
Turning into a Daemon
When configured so, SER becomes a daemon during startup. A process
is called a daemon when it has no associated controlling
terminal. See function daemonize in file
main.c for more details. The function does
the following:
chroot is performed if necessary. That
ensures that the server will have access to a particular
directory and its subdirectories only.
Server's working directory is changed if the new working
directory was specified (usually it is /).
If command line parameter -g was used, the server's group
ID is changed to that value.
If command line parameter -u was used, the server's user ID
is changed to that value.
Perform fork, let the parent process
exit. This ensures that we are not a group leader.
Perform setsid to become a session
leader and drop the controlling terminal.
Fork again to drop group leadership.
Create a pid file.
Close all open file descriptors.
Module Initialization
The whole config file has been parsed and all modules have been loaded,
so they can be initialized now. A module can tell the core that it
needs to be initialized by exporting a mod_init
function. The mod_init functions of all loaded
modules are called now.
Routing List Fixing
After the whole routing list has been parsed, there might still be
places that can be processed further to speed up the server. For
example, several commands accept a regular expression as one of their
parameters. The regular expression can be compiled too, and
processing of a compiled expression will be much faster.
Another example might be a string as a parameter of a function. For
example if you call append_hf("Server: SIP Express
Router\r\n") from the routing script, the function will
append a new header field after the last one. In this case, the
function needs to know the length of the string parameter. It could
call strlen every time it is called, but that
is not a very good idea because strlen would
be called every time a message is processed and that is not
necessary.
Instead, the length of the string parameter can be
pre-calculated upon server startup, saved and reused later. The
processing of the request will be faster because
append_hf doesn't need to call
strlen every time; it can just reuse the saved
value.
This can be used also for string to int conversions, hostname
lookups, expression evaluation and so on.
This process is called Routing List Fixing and is done as one
of the last steps of the server startup.
Every loaded module can export one or more functions. Each such
function can have an associated fixup function, which should do the
fixing as described in this section. All such fixups of all loaded
modules are called here. That makes it possible for module
functions to fix their parameters too if necessary.
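The string-length example can be sketched as a fixup that replaces the raw string parameter with a counted string once, at startup. The names fixup_str and append_hf_demo are hypothetical, and the struct str layout (pointer plus length) is an assumption made for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Counted string: pointer plus pre-calculated length. */
struct str {
    char *s;
    int len;
};

/* Fixup, run once at startup: swap the char* parameter for a
 * struct str* whose length is already computed. Returns 0 on success. */
int fixup_str(void **param)
{
    struct str *fixed = malloc(sizeof(*fixed));

    if (fixed == NULL)
        return -1;
    fixed->s = (char *)*param;
    fixed->len = (int)strlen(fixed->s);
    *param = fixed;
    return 0;
}

/* Run for every message: copy the header field into buf without
 * calling strlen. Returns the number of bytes written, or -1. */
int append_hf_demo(char *buf, size_t size, void *param)
{
    const struct str *hf = param;

    if ((size_t)hf->len > size)
        return -1;
    memcpy(buf, hf->s, (size_t)hf->len);
    return hf->len;
}
```

After the fixup the per-message function never sees the original char pointer, so the cost of strlen is paid exactly once.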
Statistics Initialization
If compiled-in, the core can produce some statistics about itself
and traffic processed. The statistics subsystem gets initialized
here, see function init_stats.
Socket Initialization
UDP socket initialization depends on dont_fork
variable. If this variable is set (only one process will be
processing incoming requests) and there are multiple listen
interfaces, only the first one will be used. This mode is mainly
for debugging.
If the variable is not set, then sockets for all configured
interfaces will be created and initialized. See function
udp_init in file
udp_server.c for more details.
Forking
The rest of the initialization process depends on the value of the
dont_fork variable.
dont_fork is a global variable defined in
main.c. We will describe both variants
separately.
dont_fork variable is set (not zero)
If the dont_fork variable is set, the server
will operate in a special mode. There will be only one
process processing incoming requests. This is very slow and is
intended mainly for debugging purposes. The main process will
process all incoming requests itself.
The server still needs additional children:
One child is for the timer subsystem, the child will be
processing timers independently of the main process.
FIFO server will spawn another child if enabled. The
child will be processing all commands coming through
the fifo interface.
If SNMP support was enabled, another child will be created.
The following initialization will be performed in dont_fork
mode (see function main_loop in
file main.c):
Another child will be forked for the timer subsystem.
Initialize the FIFO server if enabled, this will fork
another child. For more info about the FIFO server,
see section The FIFO server.
Call init_child(0). The function
performs per-child specific initialization of all
loaded modules. A module can be initialized through its
mod_init function. That function
is called BEFORE the server forks
and thus is common for all children.
If there is anything that needs to be initialized in
every child separately (for example if each child needs
to open its own file descriptor), it cannot be done in
mod_init. To make such
initialization possible, a module can export another
initialization function called
child_init. The function will be
called in all children AFTER the server
forks.
Since we are in "dont fork" mode and there will be no
children processing requests (remember the main process
will be processing all requests),
child_init would not be called.
That would be bad, because
child_init might do some
initialization that must be done, otherwise modules
might not work properly.
To make sure that module initialization is complete we
call init_child here for the
main process even if we are not going to fork.
That's it. Everything has been initialized properly and as the
last step we will call udp_rcv_loop which
is the main loop function. The function will be described
later.
dont_fork is not set (zero)
If dont_fork is not set, the
server will fork children and the children will be processing
incoming requests. How many children will be created depends on
the configuration (children variable). The
main process will be sleeping and handling signals only.
The main process will then initialize the FIFO server. The FIFO
server needs another child to handle communication over FIFO
and thus another child will be created. The FIFO server will be
described in more detail later.
Then the main process will perform another fork for the timer
attendant. The child will take care of timer lists and execute
the specified function when a timer fires.
The main process is now completely initialized. It will sleep
in the pause function until a signal comes and
call handle_sigs when that
happens.
The following initialization will be performed by each child
separately:
Each child executes the init_child
function. The function will sequentially call the
child_init functions of all loaded
modules.
Because the function is called in each child separately, it can
initialize per-child specific data. For example if a module
needs to communicate with a database, it must open a database
connection. If the connection were opened in the
mod_init function, all the children would
share the same connection and locking would be necessary to
avoid conflicts. On the other hand, if the connection is opened
in the child_init function, each child will
have its own connection and concurrency conflicts will be
handled by the database server.
And last, but not least, each child executes the
udp_rcv_loop function which contains the
main loop logic.
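The forking variant can be sketched as a loop that forks a fixed number of workers, each of which runs its own initialization. This is an illustration only: spawn_children, demo_child_init and MAX_CHILDREN are hypothetical, the workers exit immediately instead of entering a receive loop, and the parent waits for them instead of sleeping in pause.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 64   /* hypothetical upper bound */

/* Stand-in for per-child module initialization (hypothetical). */
static int demo_child_init(int rank)
{
    return rank >= 0 ? 0 : -1;
}

/* Fork `children` workers. Each worker initializes itself and exits
 * with its rank as the exit status. Returns the number of workers
 * that finished with the expected status, or -1 on error. */
int spawn_children(int children)
{
    pid_t pids[MAX_CHILDREN];
    int i, ok = 0;

    if (children < 0 || children > MAX_CHILDREN)
        return -1;

    for (i = 0; i < children; i++) {
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0) {
            /* Child: per-child initialization happens AFTER fork. */
            if (demo_child_init(i) != 0)
                _exit(255);
            /* A real worker would enter its receive loop here. */
            _exit(i);
        }
        pids[i] = pid;   /* parent remembers the child */
    }

    for (i = 0; i < children; i++) {
        int status;

        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == i)
            ok++;
    }
    return ok;
}
```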