This new module describes an API that can be thought of as a combination of
stasis topic pools, and caching. Except, hopefully done in a more efficient
and less memory "leaky" manner.
The API defines methods, and data structures for managing, and tracking
published message state through stasis. By adding a subscriber or publisher,
consumers can more easily track the lifetime of the contained state. For
instance, when no more publishers and/or subscribers have need of the topic,
and associated state its data is removed from the managed container.
* stasis_state_manager *
The manager stores and well, manages state data. Each state is an association
of a unique stasis topic, and the last known published stasis message on that
topic. There is only ever one managed state object per topic. For each topic
all messages are forwarded to an "all" topic also maintained by the manager.
* stasis_state_subscriber *
Topic and state can be created, or referenced within the manager by adding a
stasis_state_subscriber. When adding a subscriber if no state currently exists
new managed state is immediately created. If managed state already exists then
a new subscriber is created referencing that state. The managed state is
guaranteed to live throughout the subscriber's lifetime. State is only removed
from the manager when no other entities require it.
* stasis_state_publisher *
Topic and state can be created, or referenced within the manager by also adding
a stasis_state_publisher. When adding a publisher if no state currently exists
new managed state is created. If managed state already exists then a new
publisher is created referencing that state. The managed state is guaranteed to
live throughout the publisher's lifetime. State is only removed from the
manager when no other entities require it.
* stasis_state_observer *
Some modules may wish to watch for, and react to managed state events. By
registering a state observer, and implementing handlers for the desired
callbacks those modules can do so.
* other *
Callbacks also exist that allow consumers to iterate over all, or some of the
managed state.
ASTERISK-28442
Change-Id: I7a4a06685a96e511da9f5bd23f9601642d7bd8e5
We were using the presence of /usr/lib64 to determine where
shared libraries should be installed. This only existed on
Redhat based systems and was safe. If it existed, use it,
otherwise use /usr/lib.
Unfortunately, Ubuntu 19 decided to create a /usr/lib64 BUT
NOT INCLUDE IT IN THE DEFAULT ld.so.conf. So if anything is
installed there, it won't work.
The new method, just looks for $ID in /etc/os-release and if it's
centos or fedora, uses /usr/lib64 and if ubuntu, uses /usr/lib.
NOTE: This applies only to the CI scripts. Normal asterisk
build and install is not affected.
Change-Id: Iad66374b550fd89349bedbbf2b93f8edd195a7c3
This patch adds basic Asterisk channel statistics to the res_prometheus
module. This includes:
* asterisk_calls_sum: A running sum of the total number of
processed calls
* asterisk_calls_count: The current number of calls
* asterisk_channels_count: The current number of channels
* asterisk_channels_state: The state of any particular channel
* asterisk_channels_duration_seconds: How long a channel has existed,
in seconds
In all cases, enough information is provided with each channel metric
to determine a unique instance of Asterisk that provided the data, as
well as the name, type, unique ID, and - if present - linked ID of each
channel.
ASTERISK-28403
Change-Id: I0db306ec94205d4f58d1e7fbabfe04b185869f59
Prometheus is the defacto monitoring tool for containerized applications.
This patch adds native support to Asterisk for serving up Prometheus
compatible metrics, such that a Prometheus server can scrape an Asterisk
instance in the same fashion as it does other HTTP services.
The core module in this patch provides an API that future work can build
on top of. The API manages metrics in one of two ways:
(1) Registered metrics. In this particular case, the API assumes that
the metric (either allocated on the stack or on the heap) will have
its value updated by the module registering it at will, and not
just when Prometheus scrapes Asterisk. When a scrape does occur,
the metrics are locked so that the current value can be retrieved.
(2) Scrape callbacks. In this case, the API allows consumers to be
called via a callback function when a Prometheus initiated scrape
occurs. The consumers of the API are responsible for populating
the response to Prometheus themselves, typically using stack
allocated metrics that are then formatted properly into strings
via this module's convenience functions.
These two mechanisms balance the different ways in which information is
generated within Asterisk: some information is generated in a fashion
that makes it appropriate to update the relevant metrics immediately;
some information is better to defer until a Prometheus server asks for
it.
Note that some care has been taken in how metrics are defined to
minimize the impact on performance. Prometheus's metric definition
and its support for nesting metrics based on labels - which are
effectively key/value pairs - can make storage and managing of metrics
somewhat tricky. While a naive approach, where we allow for any number
of labels and perform a lot of heap allocations to manage the information,
would absolutely have worked, this patch instead opts to try to place
as much information in length limited arrays, stack allocations, and
vectors to minimize the performance impacts of scrapes. The author of
this patch has worked on enough systems that were driven to their knees
by poor monitoring implementations to be a bit cautious.
Additionally, this patch only adds support for gauges and counters.
Additional work to add summaries, histograms, and other Prometheus
metric types may add value in the future. This would be of particular
interest if someone wanted to track SIP response types.
Finally, this patch includes unit tests for the core APIs.
ASTERISK-28403
Change-Id: I891433a272c92fd11c705a2c36d65479a415ec42
Added a conversion for umax (largest maximum sized integer allowed). Adjusted
the other current conversion functions (uint and ulong) to be derivatives of
the umax conversion since they are simply subsets of umax.
Also made the negative check move the pointer on spaces since strtoumax does it
anyways.
Change-Id: I56c2ef2629d49b524c8df58af12951c181f81f08
One of the downaides of having things like test configuration
in the git repo is that it can't be changed at runtime. You have
to create a review for the changes and merge it mefore it will
take effect.
This review moves the data currently held in
tests/CI/periodic-dailyTestGroups.json and
tests/CI/gateTestGroups.json into a Jenkins Config File attached
to the job definitions. This allows us to alter it from the
Jenkins UI at runtime. The original files stay in the repo
as documentation.
Change-Id: I14b9702f6285ce1fb2420287ba0e7d3b59109763
The new option disables dev mode, TEST_FRAMEWORK and
MALLOC_DEBUG making the build more production-like.
Change-Id: Ieb72497d4d91d5416684aaed702cc3f532099738
It was difficult to check the channel's current application and
parameters using ARI for current channels. Added app_name, app_data
items to show the current application information.
ASTERISK-28343
Change-Id: Ia48972b3850e5099deab0faeaaf51223a1f2f38c
Added ability to specifiy a wizard is read-only when applying
it to a specific object type. This allows you to specify
create, update and delete callbacks for the wizard but limit
which object types can use them.
Added the ability to allow an object type to have multiple
wizards of the same type. This is indicated when a wizard
is added to a specific object type.
Added 3 new sorcery wizard functions:
* ast_sorcery_object_type_insert_wizard which does the same thing
as the existing ast_sorcery_insert_wizard_mapping function but
accepts the new read-only and allot-duplicates flags and also
returns the ast_sorcery_wizard structure used and it's internal
data structure. This allows immediate use of the wizard's
callbacks without having to register a "wizard mapped" observer.
* ast_sorcery_object_type_apply_wizard which does the same
thing as the existing ast_sorcery_apply_wizard_mapping function
but has the added capabilities of
ast_sorcery_object_type_insert_wizard.
* ast_sorcery_object_type_remove_wizard which removes a wizard
matching both its name and its original argument string.
* The original logic in __ast_sorcery_insert_wizard_mapping was moved
to __ast_sorcery_object_type_insert_wizard and enhanced for the
new capabilities, then __ast_sorcery_insert_wizard_mapping was
refactored to just call __ast_sorcery_insert_wizard_mapping.
* Added a unit test to test_sorcery.c to test the read-only
capability.
Change-Id: I40f35840252e4313d99e11dbd80e270a3aa10605
Changed to requirement to having timestamp for all of ARI events.
The below ARI events were changed to having timestamp.
PlaybackStarted, PlaybackContinuing, PlaybackFinished,
RecordingStarted, RecordingFinished, RecordingFailed,
ApplicationReplaced, ApplicationMoveFailed
ASTERISK-28326
Change-Id: I382c2fef58f5fe107e1074869a6d05310accb41f
The recent upgrade of Gerrit to 2.16 elimiated referencing a
repository in a way the jenkinsfiles were relying on so
the URL references were changed to a more consistent and supported
format.
Change-Id: I2e8e3f213b9a96bb1b27665eca4a9a24bc49820e
(cherry picked from commit 5ce084579f)
To prevent one subsystem's taskprocessors from causing others
to stall, new capabilities have been added to taskprocessors.
* Any taskprocessor name that has a '/' will have the part
before the '/' saved as its "subsystem".
Examples:
"sorcery/acl-0000006a" and "sorcery/aor-00000019"
will be grouped to subsystem "sorcery".
"pjsip/distributor-00000025" and "pjsip/distributor-00000026"
will bn grouped to subsystem "pjsip".
Taskprocessors with no '/' have an empty subsystem.
* When a taskprocessor enters high-water alert status and it
has a non-empty subsystem, the subsystem alert count will
be incremented.
* When a taskprocessor leaves high-water alert status and it
has a non-empty subsystem, the subsystem alert count will be
decremented.
* A new api ast_taskprocessor_get_subsystem_alert() has been
added that returns the number of taskprocessors in alert for
the subsystem.
* A new CLI command "core show taskprocessor alerted subsystems"
has been added.
* A new unit test was addded.
REMINDER: The taskprocessor code itself doesn't take any action
based on high-water alerts or overloading. It's up to taskprocessor
users to check and take action themselves. Currently only the pjsip
distributor does this.
* A new pjsip/global option "taskprocessor_overload_trigger"
has been added that allows the user to select the trigger
mechanism the distributor uses to pause accepting new requests.
"none": Don't pause on any overload condition.
"global": Pause on ANY taskprocessor overload (the default and
current behavior)
"pjsip_only": Pause only on pjsip taskprocessor overloads.
* The core pjsip pool was renamed from "SIP" to "pjsip" so it can
be properly grouped into the "pjsip" subsystem.
* stasis taskprocessor names were changed to "stasis" as the
subsystem.
* Sorcery core taskprocessor names were changed to "sorcery" to
match the object taskprocessors.
Change-Id: I8c19068bb2fc26610a9f0b8624bdf577a04fcd56
Some tests require Asterisk to execute scripts which
are stored in /tmp. When mount is used for tmpfs there
is no ability to allow scripts to be executed from
that location.
This change switches to using tmpfs which can be told
to allow executables to be run from /tmp.
Change-Id: I0e598ca2b76af1f7f2d29f0da7b1731a214a291a
This change makes it so that even if non-code changes
occur (such as commit message changing) unit tests
will still be run and result in a verification.
ASTERISK-28251
Change-Id: I6491fff7c93e5d5cd8e41054486968bf66c4f608
A subscriber can now indicate that it only wants messages
that have formatters of a specific type. For instance,
manager can indicate that it only wants messages that have a
"to_ami" formatter. You can combine this with the existing
filter for message type to get only messages with specific
formatters or messages of specific types.
ASTERISK-28186
Change-Id: Ifdb7a222a73b6b56c6bb9e4ee93dc8a394a5494c
* Added ---no-configure, --no-menuselect, --no-make and --no-alembic
options that prevent those actions from being performed. Useful
for testing and re-running portions of the build after fixing
earlier failures.
* Added "set -e" to abort the script on command failure.
Not sure why this wasn't there in the first place.
* Fixed a few echos that were redirecting to stderr when they shouldn't
have been.
* Catch more alembic failures by actually trying to generate the SQL.
Change-Id: I9f395fa4e9254be7299e7c1014f1a13db78faffb
This test was occasionally failing, with:
WARNING[5812]: http.c:1939 httpd_helper_thread: Failed to set
TCP_NODELAY on HTTP connection: Bad file descriptor
ERROR[5812]: iostream.c:91 ast_iostream_nonblock: Failed to get
fcntl() flags for file descriptor: Bad file descriptor
ERROR[5812]: iostream.c:569 ast_iostream_close: close() failed: Bad
file descriptor
Disabled for now by making the test explicit only.
Change-Id: I778f6cbb6104c6b4e89737a2eaf1a9540888d351
These are only a few of the leaks. The large number of macros
and return paths in this file would make a weeks worth of work
to plug them all.
Change-Id: Ie2369fa944023d44767871c5c30974cb077ffb56
* The bridging core no longer uses the stasis cache for bridge
snapshots. The latest bridge snapshot is now stored on the
ast_bridge structure itself.
* The following APIs are no longer available since the stasis cache
is no longer used:
ast_bridge_topic_cached()
ast_bridge_topic_all_cached()
* A topic pool is now used for individual bridge topics.
* The ast_bridge_cache() function was removed since there's no
longer a separate container of snapshots.
* A new function "ast_bridges()" was created to retrieve the
container of all bridges. Users formerly calling
ast_bridge_cache() can use the new function to iterate over
bridges and retrieve the latest snapshot directly from the
bridge.
* The ast_bridge_snapshot_get_latest() function was renamed to
ast_bridge_get_snapshot_by_uniqueid().
* A new function "ast_bridge_get_snapshot()" was created to retrieve
the bridge snapshot directly from the bridge structure.
* The ast_bridge_topic_all() function now returns a normal topic
not a cached one so you can't use stasis cache functions on it
either.
* The ast_bridge_snapshot_type() stasis message now has the
ast_bridge_snapshot_update structure as it's data. It contains
the last snapshot and the new one.
* cdr, cel, manager and ari have been updated to use the new
arrangement.
Change-Id: I7049b80efa88676ce5c4666f818fa18ad1985369
When a channel snapshot was created it used to be done
from scratch, copying all data (many strings). This incurs
a cost when doing so.
This change segments the channel snapshot into different
components which can be reused if unchanged from the
previous snapshot creation, reducing the cost. In normal
cases this results in some pointers being copied with
reference count being bumped, some integers being set,
and a string or two copied. The other benefit is that it
is now possible to determine if a channel snapshot update
is redundant and thus stop it before a message is published
to stasis.
The specific segments in the channel snapshot were split up
based on whether they are changed together, how often they
are changed, and their general grouping. In practice only
1 (or 0) of the segments actually get changed in normal
operation.
Invalidation is done by setting a flag on the channel when
the segment source is changed, forcing creation of a new
segment when the channel snapshot is created.
ASTERISK-28119
Change-Id: I5d7ef3df963a88ac47bc187d73c5225c315f8423
Channels no longer use the Stasis cache for channel snapshots. Instead
they are stored in a hash table in stasis_channels which reduces the
number of Stasis messages created and allows better storage.
As a result the following APIs are no longer available since the stasis
cache is no longer used:
ast_channel_topic_cached()
ast_channel_topic_all_cached()
The ast_channel_cache_all() and ast_channel_cache_by_name() functions
now return an ao2_container of ast_channel_snapshots rather than
a container of stasis_messages therefore you can't (and don't need
to) call stasis_cache functions on it.
The ast_channel_topic_all() function now returns a normal topic not
a cached one so you can't use stasis cache functions on it either.
The ast_channel_snapshot_type() stasis message now has the
ast_channel_snapshot_update structure as it's data. It contains the
last snapshot and the new one.
ast_channel_snapshot_get_latest() still returns the latest snapshot.
The latest snapshot is now stored on the channel itself to eliminate
cache hits when Stasis messages that have the snapshot as a payload
are created.
ASTERISK-28102
Change-Id: I9334febff60a82d7c39703e49059fa3a68825786
Replace usage of ao2_container_alloc with ao2_container_alloc_hash or
ao2_container_alloc_list. Remove ao2_container_alloc macro.
Change-Id: I0907d78bc66efc775672df37c8faad00f2f6c088
Create ao2_container_dup_weakproxy_objs to perform a similar function to
ao2_container_dup. This function expects the source container to have
weakproxy objects, inserts the associated non-weak objects into the
destination container. Orphaned weakproxy objects are ignored.
Create test for this new function and for ao2_weakproxy_find.
Change-Id: I898387f058057e08696fe9070f8cd94ef3a27482
The job timeouts were hard coded in the jenkinsfiles which
means changes had to go through gerrit. Now they are taken
from the following environment variables (and their defaults) that
can be set in Jenkins configuration...
TIMEOUT_GATES = "60 MINUTES"
TIMEOUT_DAILIES = "3 HOURS"
TIMEOUT_REF_DEBUG = "24 HOURS"
TIMEOUT_UNITTESTS = "30 MINUTES"
Change-Id: I673a551c1780bf665a3bc160b245da574aa4bbab
We've been seeing crashes in libbfd when we attempt to generate
a stack trace from multiple threads. It turns out that libbfd
is NOT thread-safe. It can cache the bfd structure and give it to
multiple threads without protecting itself. To get around this,
we've added a global mutex around the bfd functions and also have
refactored the use of those functions to be more efficient and
to provide more information about inlined functions.
Also added a few more tests to test_pbx.c. One just calls
ast_assert() and the other calls ast_log_backtrace(). Neither are
run by default.
WARNING: This change necessitated changing the return value of
ast_bt_get_symbols() from an array of strings to a VECTOR of
strings. However, the use of this function outside Asterisk is not
likely.
ASTERISK-28140
Change-Id: I79d02862ddaa2423a0809caa4b3b85c128131621
The testsuite can now use a user-specified work directory for
all it's temp files. This allows the docker containers to use
a tmpfs backed directory for the temp files instead of it's
own write-layer image.
* runTestsuite.sh now accepts a --work-dir command line argument
that gets exported as AST_WORK_DIR before running the testsuite.
* gates.jenkinsfile now specifies --work-dir to be
<testsuite_dir>/astroot.
Since the Asterisk CI docker hosts now mount /srv/jenkins/workspace
on a tmpfs, asterisk should be compiled and the testsuite run all in
memory.
Change-Id: If5ee905a15821296c355bb84cda38950ad8edc45
(cherry picked from commit a335f4c9ad)
There seems to be a race condition between starting the asterisk
daemon and attempting to use 'asterisk -r' that can cause the
control socket file to not be created. Since all of the Jenkins
slaves have 'expect' installed, the runUnittests script can use
it to start asterisk in the forground and issue the commands
interactively. This is much more reliable and it can also make
startup errors more visible since they'll be in the Jenkins console
output.
If 'expect' isn't installed, the original daemon/asterisk -r
process is used.
Also added a "core show settings" before running the tests
and added "notice,warning,error" to the console log.
Change-Id: Idd656085f854afede813ac241b9e312b31358160
It's possible for a 4th task to be spawned before we cancel. This
results in a write to the already freed test_data1. Wait long enough to
verify success of the cancelation before freeing test_data1.
Change-Id: I057e2fcbe97f8a175e50890be89c28c20490a20f
These macros have been documented as legacy for a long time but are
still used in new code because they exist. Remove all references to:
* ao2_container_alloc_options
* ao2_t_container_alloc_options
* ao2_t_container_alloc
These macro's are also removed. Only ao2_container_alloc remains due to
it's use in over 100 places.
Change-Id: I1a26258b5bf3deb081aaeed11a0baa175c933c7a
Add attribute_warn_unused_result to ast_taskprocessor_push,
ast_taskprocessor_push_local and ast_threadpool_push. This will help
ensure we perform the necessary cleanup upon failure.
Change-Id: I7e4079bd7b21cfe52fb431ea79e41314520c3f6d
Fix redirection to /dev/null of cleanup commands. The '2' was being
interpreted as part of the command instead of part of the redirect.
Change-Id: I2e3a591b165e0288c4b82b9ef475fdfd5392a90a
Can't do anonymous http checkout from Security-testsuite.
Need to use same credentials as the gerrit review checkout.
Change-Id: I87af68c995cb8926f5e87f9af245600d76984f05
As they're not actively used, they only grow stale. The moduleinfo field itself
is kept in Asterisk 13/15 for ABI compatibility.
ASTERISK-28046 #close
Change-Id: I8df66a7007f807840414bb348511a8c14c05a9fc
This new option can be passed for ./configure or
./tests/CI/buildAsterisk.sh to prevent download/install of binary
modules.
Normally enabling the categories MENUSELECT_CODECS or MENUSELECT_RES
will result in binary modules being enabled even if the build target is
incompatible with those modules. This includes CI scripts which enable
categories before disabling specific modules.
If more binary modules are offered in the future this will help avoid
accidentally downloading them if unwanted or incompatible. Adding a
binary module will only require creating a new menuselect entry similar
to the existing ones, it will not be necessary to modify the CI scripts.
Change-Id: I6b1bd1c75a2e48f05b8b8a45b7a7a2d00a079166
If the review to be tested is in a project with restricted access,
we need to use the jenkins user's gerrit https credentials when we
do the checkout or the checkout will fail.
Change-Id: I9dc9994763c5ebfeb9f1cff60fb53f6902b7fd5f
Enable coverage with `./tests/CI/buildAsterisk.sh --coverage`. This
will cause Asterisk to be compiled with coverage support. It also
initializes 'before' coverage data for all sources. Accept
--tested-only to disable modules which are not run by any test.
Enabling coverage also sets tested-only true by default. To build
everything with coverage enabled use `--coverage --tested-only=0`.
./tests/CI/processCoverage.sh is used to process the coverage and
generate HTML reports.
Fix utils/check_expr2 which failed to compiled with coverage enabled.
Add status output 5 times per stage of astobj2_test_perf to ensure
remote CLI does not timeout when compiled with coverage. Remote CLI
disconnects if no output is received for 60 seconds. When coverage is
enabled it takes about 70 seconds for my laptop to run the stages of
this test, so with the change a message is printed every 14 seconds.
Change-Id: I890f7d5665087426ad7d3e363187691b9afc2222
Changing any Menuselect option in the `Compiler Flags` section causes a
full rebuild of the Asterisk source tree. Every enabled option causes
a #define to be added to buildopts.h, thus breaking ccache caching for
every source file that includes "asterisk.h". In most cases each option
only applies to one or two files. Now we only define those options for
the specific sources which use them, this causes much better cache
matching when working with multiple builds. For example testing code
with an without MALLOC_DEBUG will now use just over half the ccache
size, only main/astmm.o will have two builds cached instead of every
file.
Reorder main/Makefile so _ASTCFLAGS set on specific object files are all
together, sorted by filename. Stop adding -DMALLOC_DEBUG to CFLAGS of
bundled pjproject, this define is no longer used by any header so only
serves to break cache.
The only code change is a slight adjustment to how main/astmm.c is
initialized. Initialization functions always exist so main/asterisk.c
can call them unconditionally. Additionally rename the astmm
initialization functions so they are not exported.
Change-Id: Ie2085237a964f6e1e6fff55ed046e2afff83c027
The --test-command argument has now been split, unit tests now use
`--unittest-command` and the testsuite uses --testsuite-command.
This will make it easier to create a script which run everything by
forwarding the same arguments to all CI scripts.
Change-Id: Ia54aa4848eaffbdf13175fcda40fc0b23080ad71
Support has been added for receiving a NACK request and handling it.
Now, Asterisk can detect when a NACK request should be sent and knows
how to construct one based on the packets we've received from the remote
end. A buffer has been added that will store out of order packets until
we receive the packet we are expecting. Then, these packets are handled
like normal and frames are queued to the core like normal. Asterisk
knows which packets to request in the NACK request using a vector
which stores the sequence numbers of the packets we are currently missing.
If a missing packet is received, cycle through the buffer until we reach
another packet we have not received yet. If the buffer reaches a certain
size, send a NACK request. If the buffer reaches its max size, queue all
frames to the core and wipe the buffer and vector.
According to RFC3711, the NACK request must be sent out in a compound
packet. All compound packets must start with a sender or receiver
report, so some work was done to refactor the current sender / receiver
code to allow it to be used without having to also include sdes
information and automatically send the report.
Also added additional functionality to ast_data_buffer, along with some
testing.
For more information, refer to the wiki page:
https://wiki.asterisk.org/wiki/display/AST/WebRTC+User+Experience+Improvements
ASTERISK-27810 #close
Change-Id: Idab644b08a1593659c92cda64132ccc203fe991d