rate-o-mat multi-master mode requires statement-based
replication. However, this was disabled in the past.
Change-Id: I9896c380626640e27150e55c6eec3fbf39ceecdf
We have randomly failing tests (see e.g.
https://jenkins.mgm.sipwise.com/job/rate-o-mat-tests-docker/46684/testReport/):
| <init>.rateomat-10-prepaid-costs_t 0 ms 4
| rateomat-10-prepaid-costs_t.40 - source_customer_cost = 150.000000 0 ms 1
| rateomat-10-prepaid-costs_t.41 - rating_status = ok 0 ms 1
| rateomat-10-prepaid-costs_t.43 - prepaid cost record does not exist 0 ms 1
| rateomat-10-prepaid-costs_t.75 - rating_status = ok 0 ms 1
| rateomat-10-prepaid-costs_t.77 - source_customer_cost = 150.000000 0 ms 1
| rateomat-10-prepaid-costs_t.78 - rating_status = ok 0 ms 1
| rateomat-10-prepaid-costs_t.80 - prepaid cost record does not exist 0 ms 1
| rateomat-10-prepaid-costs_t.73 - source_customer_cost = 150.000000 0 ms 2
| rateomat-10-prepaid-costs_t.76 - prepaid cost record does not exist 0 ms 3
| rateomat-10-prepaid-costs_t.52 - cdrs were all processed 0 ms 4
| rateomat-10-prepaid-costs_t.81 - cdrs were all processed 0 ms 4
This seems to be caused by timing issues on our build infrastructure:
rateomat gets stopped before it has finished processing.
Give the prepaid-costs tests some more time to finish,
at least until we have a better way to handle this situation.
Change-Id: I0320f10a0814e1d51732264ddb6f9940c40ae4b9
when replication is stuck (for whatever reason), each rate-o-mat
instance should basically continue processing its local
CDRs.
however, we observed that it stalls in that case.
the reason is that the nodeA rate-o-mat currently processes
even CDR IDs and the nodeB rate-o-mat processes odd CDR IDs.
it has to be vice versa for the current offset configuration
in my.cnf. we might want to read the offset configuration
directly from the .cnf in a next step.
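
a minimal sketch of why the parity has to match the offsets. it assumes
auto_increment_increment = 2 with auto_increment_offset = 1 on nodeA and
2 on nodeB; the actual my.cnf values are not quoted here, and the
function names are hypothetical, not taken from rate-o-mat.

```python
def local_ids(offset, increment, count):
    """First `count` auto-increment IDs a node with these settings generates."""
    return [offset + i * increment for i in range(count)]


def is_local_cdr(cdr_id, offset, increment=2):
    """True if `cdr_id` was generated by the node using this offset."""
    return (cdr_id - offset) % increment == 0


# Assumed settings: nodeA offset 1, nodeB offset 2 (not read from my.cnf).
NODE_OFFSETS = {"nodeA": 1, "nodeB": 2}

for node, offset in NODE_OFFSETS.items():
    print(node, local_ids(offset, 2, 4))
```

under these assumed offsets nodeA only ever inserts odd IDs, so an
instance on nodeA that waits for even IDs depends on replication to
receive any work at all.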
Change-Id: I0d6bafbd33f0d3370393a8cf1a763a7963bc017f
replication can nowadays fail when inserting rows with a
regular auto-increment id.
until this can be solved, we provide an option to switch
back to rate-o-mat running on the active node only.
Change-Id: I843899d4ffa0b89e41868673b90b7b0a9c14ef90
IODKU (INSERT ... ON DUPLICATE KEY UPDATE) statements can fail
under master-master replication.
switching binlog_format to 'STATEMENT' to mitigate.
Change-Id: I5e7d2438c6338e6cc7b17bf9c51a8a4712f72454
This prevents log messages from being buffered if
output goes to a pipe (e.g. when consumed by journald).
Without autoflush, Perl block-buffers STDOUT whenever it
is not attached to a terminal, as is the case under journald.
At the same time, Perl flushes STDERR immediately, which is
confusing because the log lines end up interleaved out of order.
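
The effect can be reproduced outside Perl. The following is only an
illustrative Python analogue, not the rate-o-mat code: two in-memory
streams stand in for the pipe, and `make_log_stream` is a hypothetical
helper name.

```python
import io


def make_log_stream(raw, autoflush):
    """Wrap a binary stream the way a text-mode STDOUT wraps a pipe."""
    return io.TextIOWrapper(raw, encoding="utf-8", line_buffering=autoflush)


buffered_raw, flushed_raw = io.BytesIO(), io.BytesIO()
buffered = make_log_stream(buffered_raw, autoflush=False)
flushed = make_log_stream(flushed_raw, autoflush=True)

buffered.write("INFO: Starting\n")
flushed.write("INFO: Starting\n")

# Compare how many bytes each stream has handed to its "pipe" so far.
print(len(buffered_raw.getvalue()), len(flushed_raw.getvalue()))
```

For a real Python process the same switch would be
`sys.stdout.reconfigure(line_buffering=True)`.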
Change-Id: I5a2d07285a99c5628ae69b87014d9919433f816d
Duplication DB credentials are normally missing on NGCP.
Generating a warning here for no reason confuses newcomers
and causes trouble with tracing, as warnings are printed
immediately while info messages are buffered.
Change-Id: I1b29c62566f162d797321f08a1773f7e52f61ef0
The function is connect_billdbh(), but rate-o-mat prints
a message that was copy-pasted from connect_dupdbh().
Change-Id: I4fc0e0ab48b78335c455cb3924f31cd7dc212495
On (re)start, rate-o-mat was starting with a delay interval of 0.
This produced a wave of load on the DB: 1) SQL queries + 2) constant DB
connect+disconnect cycles until the delay interval had grown big enough
(+0.02 sec on every run).
In fact, rate-o-mat was reconnecting to the DB 250+ times in the first
5 minutes after a service restart. At the same time, the DB executed tons
of queries only to find no CDRs to process, which also consumes CPU on
the DB host.
> rg Starting /var/log/ngcp/rate-o-mat.log
> Jul 1 11:08:53 spce (info) ngcp-rate-o-mat[148063]: INFO: Starting rate-o-mate.
> root@spce:~# for min in $(seq -f "%02g" 08 34) ; do \
> rg "11:${min}" /var/log/ngcp/rate-o-mat.log | \
> rg -c "\[148063\]: INFO: Batch of" ; \
> done
> 35 # rate-o-mat looped 35 times in minute 11:08 (7 seconds left in use)
> 70 # rate-o-mat looped 70 times in minute 11:09 (60 seconds)
> 46 # rate-o-mat looped 46 times in minute 11:10 (60 seconds)
> 35 # rate-o-mat looped 35 times in minute 11:11 (60 seconds)
> 35 # ...
> 24
> 23
> 23
> 12
> 12
> 11
> ...
With the fix, rate-o-mat starts with a delay interval of 10 seconds
(i.e. it loops 6 times per minute) and decreases the interval when
more CDRs are detected for processing.
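
A rough sketch of such a delay policy, for illustration only. The
10 second start value and the 0.02 second step come from this message;
halving the interval on work, capping it at the start value, and the
name `next_delay` are assumptions, not the actual rate-o-mat logic.

```python
INITIAL_DELAY = 10.0  # seconds, start value named in this message
MAX_DELAY = 10.0      # assumption: never wait longer than the start value
DELAY_STEP = 0.02     # assumption: idle growth step kept from the old behaviour


def next_delay(current, cdrs_found):
    """Return the sleep interval to use before the next loop iteration."""
    if cdrs_found > 0:
        # Assumption: halve the interval while there is work to do.
        return current / 2
    return min(MAX_DELAY, current + DELAY_STEP)


delay = INITIAL_DELAY
for batch in (0, 40, 40, 0):  # hypothetical batch sizes
    delay = next_delay(delay, batch)
    print(f"{batch} CDRs -> next delay {delay:.2f}s")
```

Starting at the upper bound means an idle service polls only 6 times
per minute right after a restart, instead of ramping up from 0.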
Change-Id: I653e49a352ff38d8a5ca70098147095e05a0ed57
In the rate-o-mat-tests-docker job we run the code from a docker
container where this file is missing.
We need this file to get ngcp_hostname so that rate-o-mat can work in
parallel: the sp1 node processes CDRs with odd IDs and sp2 processes CDRs
with even IDs. If ngcp_hostname is unknown, rate-o-mat simply
processes all CDRs, so nothing malfunctions here.
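
The fallback can be pictured roughly as follows. This is a sketch under
assumptions: the file is taken to be plain text holding only the node
name, and the path and both function names are hypothetical.

```python
import os
import tempfile


def read_hostname(path):
    """Return the node name stored in `path`, or None if the file is missing."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read().strip() or None
    except FileNotFoundError:
        return None


def cdr_filter(hostname):
    """Map a node name to the CDR ID parity it handles (None = all CDRs)."""
    return {"sp1": "odd", "sp2": "even"}.get(hostname)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ngcp_hostname")  # hypothetical file name
    print(cdr_filter(read_hostname(path)))     # file missing, as in docker
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("sp1\n")
    print(cdr_filter(read_hostname(path)))     # file present on a real node
```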
Change-Id: Ica67cb1a9febbf322fa783a4fe3d1a3762dae672
Sometimes we have 12 failing tests in rateomat-10-prepaid-costs,
caused by rateomat getting stopped before it has finished
processing. Give the prepaid-costs tests some more time to finish.
Change-Id: I525b5da17f46616d9b0ed3a5fc9c13035332f482