Let's restore the system state of /run/systemd/system for
VBoxLinuxAdditions, to avoid any unexpected side effects.
Followup for git rev 8601193.
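A minimal sketch of what that restore could look like, assuming the change being followed up (git rev 8601193) created the directory to fake systemd presence (variable names are illustrative, not the exact deployment.sh code):
| # only clean up if we created /run/systemd/system ourselves before the installer run
| if "${created_run_systemd_system:-false}" ; then
|   rmdir "${TARGET}/run/systemd/system" || true
| fi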
Change-Id: I632c7d60ebb627c3a80d4c1f9b264d6d0a13b4f1
Recent Grml ISOs, including our Grml-Sipwise ISO (v2023-06-01), include
grml-autoconfig v0.20.3, which executes the grml-autoconfig service with
`StandardInput=null`. This is necessary to avoid conflicting with tty
usage, e.g. when a serial console is used. See
1e268ffe4f
Now that we run with /dev/null for stdin, we can't interact with the
user, so let's try to detect when running from within grml-autoconfig's
systemd unit, and if so assume that we're executing on /dev/tty1 and
use/reopen that for stdin.
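A minimal sketch of such detection and stdin reopening (how exactly we detect the grml-autoconfig unit is an assumption here):
| # stdin is /dev/null when started via grml-autoconfig's systemd unit,
| # so reopen the console to be able to interact with the user again
| if ! tty -s && systemctl is-active --quiet grml-autoconfig.service 2>/dev/null ; then
|   exec < /dev/tty1
| fi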
Change-Id: Id55283c7f862487a6ef8acb8ab01f67a05bd8dd7
As of git rev 6c960afee4 we're using the
virtualbox-guest-additions-iso from bookworm.
Previous versions of VBoxGuestAdditions had a simple test to check for
the presence of systemd, quoting from
/opt/VBoxGuestAdditions-6.1.22/routines.sh:
| use_systemd()
| {
|     test ! -f /sbin/init || test -L /sbin/init
| }
Now in more recent versions of VBoxGuestAdditions[1], the systemd check
was modified, quoting from /opt/VBoxGuestAdditions-7.0.6/routines.sh:
| use_systemd()
| {
|     # First condition is what halfway recent systemd uses itself, and the
|     # other two checks should cover everything back to v1.
|     test -e /run/systemd/system || test -e /sys/fs/cgroup/systemd || test -e /cgroup/systemd
| }
So if we're running inside a chroot, as with our deployment.sh, the
system looks like a non-systemd system to the VBoxGuestAdditions
installer, and we end up with an installed /etc/init.d/vboxadd, leading to:
| root@spce:~# ls -lah /run/systemd/generator.late/
| total 4.0K
| drwxr-xr-x 4 root root 100 Jul 18 00:20 .
| drwxr-xr-x 23 root root 580 Jul 18 00:20 ..
| drwxr-xr-x 2 root root 60 Jul 18 00:20 graphical.target.wants
| drwxr-xr-x 2 root root 60 Jul 18 00:20 multi-user.target.wants
| -rw-r--r-- 1 root root 537 Jul 18 00:20 vboxadd.service
|
| root@spce:~# systemctl cat vboxadd.service
| # /run/systemd/generator.late/vboxadd.service
| # Automatically generated by systemd-sysv-generator
|
| [Unit]
| Documentation=man:systemd-sysv-generator(8)
| SourcePath=/etc/init.d/vboxadd
| Description=LSB: VirtualBox Linux Additions kernel modules
| Before=multi-user.target
| Before=multi-user.target
| Before=multi-user.target
| Before=graphical.target
| Before=display-manager.service
|
| [Service]
| Type=forking
| Restart=no
| TimeoutSec=5min
| IgnoreSIGPIPE=no
| KillMode=process
| GuessMainPID=no
| RemainAfterExit=yes
| SuccessExitStatus=5 6
| ExecStart=/etc/init.d/vboxadd start
| ExecStop=/etc/init.d/vboxadd stop
We don't expect any init scripts to be present, as all our services must
have systemd unit files. Therefore we check for absence of systemd's
/run/systemd/generator.late in our system-tests, which started to fail
with the upgrade to VBoxGuestAdditions-v7.0.6 due to the systemd
presence detection mentioned above.
Let's fake presence of systemd before invoking VBoxGuestAdditions's
installer, to avoid ending up with unexpected vbox* init scripts.
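A minimal sketch of how faking that systemd presence inside the chroot could look (hedged; variable names are illustrative):
| # make routines.sh's use_systemd() succeed inside the chroot
| created_run_systemd_system=false
| if ! [ -d "${TARGET}/run/systemd/system" ] ; then
|   mkdir -p "${TARGET}/run/systemd/system"
|   created_run_systemd_system=true
| fi
The followup change further up in this log then removes the directory again if it was created here.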
[1] See svn rev 92682:
https://www.virtualbox.org/browser/vbox/trunk/src/VBox/Installer/linux/routines.sh?rev=92682 and
https://www.virtualbox.org/changeset?old=92681&old_path=vbox%2Ftrunk%2Fsrc%2FVBox%2FInstaller%2Flinux%2Froutines.sh&new=92682&new_path=vbox%2Ftrunk%2Fsrc%2FVBox%2FInstaller%2Flinux%2Froutines.sh
Change-Id: Ifd11460e3a8fd4f4c1269453a9b8376065861b8e
Support the bookworm option in the DEBIAN_RELEASE selection; we already
have support for deploying it.
Use bookworm as the fallback, since we have moved to it by now.
Change-Id: I118c1b5cf81fe57394495b5f745fc81032406c78
To be able to upgrade our internal systems to Debian/bookworm
we need to have puppet packages available.
Upstream still doesn't provide any Debian packages
(see https://tickets.puppetlabs.com/browse/PA-4995),
though their AIO (All In One) packages for Debian/bullseye
seem to work on Debian/bookworm as well (at least for
puppet-agent). So until we either migrate to the puppet-agent
as present in Debian/bookworm or upstream provides corresponding
AIO packages, let's use the puppet-agent packages we already
use for our Debian/bullseye systems.
Change-Id: I2211ffd79f70a2a79873e737b0b512bfb7492328
Since version 1.20.0, dpkg no longer creates /var/lib/dpkg/available
(see #647911). Now that we upgraded our Grml-Sipwise deployment system
to bookworm, we have dpkg v1.21.22 on our live system, and mmdebstrap
relies on dpkg of the host system for execution.
But on Debian releases until and including buster, dpkg fails to operate
with e.g. `dpkg --set-selections`, if /var/lib/dpkg/available doesn't
exist:
| The following NEW packages will be installed:
| nullmailer
| [...]
| debconf: delaying package configuration, since apt-utils is not installed
| dpkg: error: failed to open package info file '/var/lib/dpkg/available' for reading: No such file or directory
We *could* also switch from mmdebstrap to debootstrap for deploying
Debian releases <=buster, but this would be slower and we have been
using mmdebstrap for everything for quite some time now. So instead
let's create /var/lib/dpkg/available after bootstrapping the system.
Reported towards mmdebstrap as #1037946.
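A minimal sketch of that post-bootstrap step (assuming ${TARGET} points at the freshly bootstrapped system; the actual deployment.sh code may differ):
| # dpkg >=1.20.0 (and thus mmdebstrap) no longer creates this file, but dpkg on
| # releases <=buster still wants to open it for e.g. `dpkg --set-selections`
| if ! [ -f "${TARGET}/var/lib/dpkg/available" ] ; then
|   touch "${TARGET}/var/lib/dpkg/available"
| fi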
Change-Id: I0a87ca255d5eb7144a9c093051c0a6a3114a3c0b
Now that our deployment system is based on Debian/bookworm, but our
gerrit/git server still runs on Debian/bullseye, we run into the OpenSSH
RSA issue (RSA signatures using the SHA-1 hash algorithm got disabled by default), see
https://michael-prokop.at/blog/2023/06/11/what-to-expect-from-debian-bookworm-newinbookworm/
and https://www.jhanley.com/blog/ssh-signature-algorithm-ssh-rsa-error/
We need to enable ssh-rsa usage, otherwise deployment fails with:
| Warning: Permanently added '[gerrit.mgm.sipwise.com]:29418' (ED25519) to the list of known hosts.
| sign_and_send_pubkey: no mutual signature supported
| puppet-r10k@gerrit.mgm.sipwise.com: Permission denied (publickey).
| fatal: Could not read from remote repository.
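A minimal sketch of re-enabling RSA/SHA-1 signatures for that connection (the OpenSSH option names are standard; whether we set this via ssh_config or elsewhere in deployment.sh is not shown here):
| cat >> /root/.ssh/config << EOF
| Host gerrit.mgm.sipwise.com
|   PubkeyAcceptedAlgorithms +ssh-rsa
|   HostKeyAlgorithms +ssh-rsa
| EOF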
Change-Id: I5894170dab033d52a2612beea7b6f27ab06cc586
Deploying the Debian/bookworm based NGCP system fails on a Lenovo sr250
v2 node with an Intel E810 network card:
| # lshw -c net -businfo
| Bus info Device Class Description
| =======================================================
| pci@0000:01:00.0 eth0 network Ethernet Controller E810-XXV for SFP
| pci@0000:01:00.1 eth1 network Ethernet Controller E810-XXV for SFP
| # lshw -c net
| *-network:0
| description: Ethernet interface
| product: Ethernet Controller E810-XXV for SFP
| vendor: Intel Corporation
| physical id: 0
| bus info: pci@0000:01:00.0
| logical name: eth0
| version: 02
| serial: [...]
| size: 10Gbit/s
| capacity: 25Gbit/s
| width: 64 bits
| clock: 33MHz
| capabilities: pm msi msix pciexpress vpd bus_master cap_list rom ethernet physical fibre 1000bt-fd 25000bt-fd
| configuration: autonegotiation=off broadcast=yes driver=ice driverversion=1.11.14 duplex=full firmware=2.25 0x80007027 1.2934.0 ip=192.168.90.51 latency=0 link=yes multicast=yes port=fibre speed=10Gbit/s
| resources: iomemory:400-3ff iomemory:400-3ff irq:16 memory:4002000000-4003ffffff memory:4006010000-400601ffff memory:a1d00000-a1dfffff memory:4005000000-4005ffffff memory:4006220000-400641ffff
We set up the /etc/network/interfaces file by invoking Grml's
netcardconfig script in automated mode, like:
NET_DEV=eth0 METHOD=static IPADDR=192.168.90.51 NETMASK=255.255.255.248 GATEWAY=192.168.90.49 /usr/sbin/netcardconfig
The resulting /etc/network/interfaces gets used as base for usage inside
the NGCP chroot/target system. netcardconfig shuts down the network
interface (eth0 in the example above) via ifdown, then sleeps for 3
seconds and re-enables the interface (via ifup) with the new
configuration.
This worked fine so far, but with the Intel E810 network card and
kernel version 6.1.0-9-amd64 from Debian/bookworm we see a link failure
and it takes ~10 seconds until the network device is up and running
again. The following vagrant_configuration() execution from
deployment.sh then fails:
| +11:41:01 (netscript.grml:1022): vagrant_configuration(): wget -O /var/tmp/id_rsa_sipwise.pub http://builder.mgm.sipwise.com/vagrant-ngcp/id_rsa_sipwise.pub
| --2023-06-11 11:41:01-- http://builder.mgm.sipwise.com/vagrant-ngcp/id_rsa_sipwise.pub
| Resolving builder.mgm.sipwise.com (builder.mgm.sipwise.com)... failed: Name or service not known.
| wget: unable to resolve host address 'builder.mgm.sipwise.com'
However, when we retry just a bit later, the network works fine again.
During investigation we identified that the network card flaps the
port, quoting the related log from the connected Cisco Nexus 5020
switch (with fast STP learning mode):
| nexus5k %ETHPORT-5-IF_DOWN_LINK_FAILURE: Interface Ethernet1/33 is down (Link failure)
It seems to be related to some autonegotiation problem, as when we
execute `ethtool -A eth0 rx on tx on` (no matter whether with `on` or
`off`), we see:
| [Tue Jun 13 08:51:37 2023] ice 0000:01:00.0 eth0: Autoneg did not complete so changing settings may not result in an actual change.
| [Tue Jun 13 08:51:37 2023] ice 0000:01:00.0 eth0: NIC Link is Down
| [Tue Jun 13 08:51:45 2023] ice 0000:01:00.0 eth0: NIC Link is up 10 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: NONE, Autoneg Advertised: On, Autoneg Negotiated: False, Flow Control: Rx/Tx
FTR:
| root@sp1 ~ # ethtool -A eth0 autoneg off
| netlink error: Operation not supported
| 76 root@sp1 ~ # ethtool eth0 | grep -C1 Auto-negotiation
| Duplex: Full
| Auto-negotiation: off
| Port: FIBRE
| root@sp1 ~ # ethtool -A eth0 autoneg on
| root@sp1 ~ # ethtool eth0 | grep -C1 Auto-negotiation
| Duplex: Full
| Auto-negotiation: off
| Port: FIBRE
| root@sp1 ~ # dmesg -T | tail -1
| [Tue Jun 13 08:53:26 2023] ice 0000:01:00.0 eth0: To change autoneg please use: ethtool -s <dev> autoneg <on|off>
| root@sp1 ~ # ethtool -s eth0 autoneg off
| root@sp1 ~ # ethtool -s eth0 autoneg on
| netlink error: link settings update failed
| netlink error: Operation not supported
| 75 root@sp1 ~ #
As a workaround, at least until we have a better fix/solution, we try to
reach the default gateway (or fall back to the repository host if the
gateway couldn't be identified) via ICMP/ping, and once that works we
continue as usual. Even if that check never succeeds we continue
execution, to minimize the behavior change while still having a
workaround for this specific situation available.
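A minimal sketch of that wait loop (hedged; variable names, fallback host and timeout are illustrative, not the exact deployment.sh implementation):
| # wait (up to ~30s) until the default gateway, or the repository host as
| # fallback, answers to ping - but never abort the deployment because of it
| ping_host=$(ip route show default | awk '/^default/ {print $3; exit}')
| [ -n "${ping_host}" ] || ping_host="${SIPWISE_REPO_HOST}"
| for _ in $(seq 1 30) ; do
|   ping -c 1 -w 1 "${ping_host}" >/dev/null 2>&1 && break
|   sleep 1
| done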
FTR, broken system:
| root@sp1 ~ # ethtool -i eth0
| driver: ice
| version: 6.1.0-9-amd64
| firmware-version: 2.25 0x80007027 1.2934.0
| [...]
Whereas with kernel 5.10.0-23-amd64 from Debian/bullseye we don't seem
to see that behavior:
| root@sp1:~# ethtool -i neth0
| driver: ice
| version: 5.10.0-23-amd64
| firmware-version: 2.25 0x80007027 1.2934.0
| [...]
Also using latest available ice v1.11.14 (from
https://sourceforge.net/projects/e1000/files/ice%20stable/1.11.14/)
on Kernel version 6.1.0-9-amd64 doesn't bring any change:
| root@sp1 ~ # modinfo ice
| filename: /lib/modules/6.1.0-9-amd64/updates/drivers/net/ethernet/intel/ice/ice.ko
| firmware: intel/ice/ddp/ice.pkg
| version: 1.11.14
| license: GPL v2
| description: Intel(R) Ethernet Connection E800 Series Linux Driver
| author: Intel Corporation, <linux.nics@intel.com>
| srcversion: 818E9C817731C98A25470C0
| alias: pci:v00008086d00001888sv*sd*bc*sc*i*
| [...]
| alias: pci:v00008086d00001591sv*sd*bc*sc*i*
| depends: ptp
| retpoline: Y
| name: ice
| vermagic: 6.1.0-9-amd64 SMP preempt mod_unload modversions
| parm: debug:netif level (0=none,...,16=all) (int)
| parm: fwlog_level:FW event level to log. All levels <= to the specified value are enabled. Values: 0=none, 1=error, 2=warning, 3=normal, 4=verbose. Invalid values: >=5
| (ushort)
| parm: fwlog_events:FW events to log (32-bit mask)
| (ulong)
| root@sp1 ~ # ethtool -i eth0 | head -3
| driver: ice
| version: 1.11.14
| firmware-version: 2.25 0x80007027 1.2934.0
| root@sp1 ~ #
Change-Id: Ieafe648be4e06ed0d936611ebaf8ee54266b6f3c
Re-reading of disks fails if the mdadm SW-RAID device is still active:
| root@sp1 ~ # cat /proc/mdstat
| Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
| md0 : active raid1 sdb3[1] sda3[0]
| 468218880 blocks super 1.2 [2/2] [UU]
| [========>............] resync = 42.2% (197855168/468218880) finish=22.4min speed=200756K/sec
| bitmap: 3/4 pages [12KB], 65536KB chunk
|
| unused devices: <none>
| root@sp1 ~ # blockdev --rereadpt /dev/sdb
| blockdev: ioctl error on BLKRRPART: Device or resource busy
| 1 root@sp1 ~ # blockdev --rereadpt /dev/sda
| blockdev: ioctl error on BLKRRPART: Device or resource busy
| 1 root@sp1 ~ #
Only if we stop the mdadm SW-RAID device, then we can re-read the
partition table:
| root@sp1 ~ # mdadm --stop /dev/md0
| mdadm: stopped /dev/md0
| root@sp1 ~ # blockdev --rereadpt /dev/sda
| root@sp1 ~ #
This behavior isn't new and is unrelated to Debian/bookworm; it was
spotted while debugging an unrelated issue.
FTR: we re-read the partition table (via `blockdev --rereadpt`) to ensure
that /etc/fstab of the live system is up to date and matches the current
system state. While this isn't strictly needed, we preserve the existing
behavior and also try to avoid a hard "cut" of a possibly ongoing
SW-RAID sync.
Change-Id: I735b00423e6efa932f74b78a38ed023576e5d306
With our newer Grml-Sipwise ISO (v2023-06-01) being based on
Debian/bookworm and recent Grml packages, our automated deployment
suddenly started to fail for us:
| +04:28:12 (netscript.grml:2453): echo 'Successfully finished deployment process [Fri Jun 2 04:28:12 UTC 2023 - running 576 seconds]'
| ++04:28:12 (netscript.grml:2455): get_deploy_status
| ++04:28:12 (netscript.grml:95): get_deploy_status(): '[' -r /srv/deployment//status ']'
| ++04:28:12 (netscript.grml:96): get_deploy_status(): cat /srv/deployment//status
| Successfully finished deployment process [Fri Jun 2 04:28:12 UTC 2023 - running 576 seconds]
| +04:28:12 (netscript.grml:2455): '[' copylogfiles '!=' error ']'
| +04:28:12 (netscript.grml:2456): set_deploy_status finished
| +04:28:12 (netscript.grml:103): set_deploy_status(): '[' -n finished ']'
| +04:28:12 (netscript.grml:104): set_deploy_status(): echo finished
| +04:28:12 (netscript.grml:2459): false
| +04:28:12 (netscript.grml:2463): status_wait
| +04:28:12 (netscript.grml:329): status_wait(): [[ -n 0 ]]
| +04:28:12 (netscript.grml:329): status_wait(): [[ 0 != 0 ]]
| +04:28:12 (netscript.grml:2466): false
| +04:28:12 (netscript.grml:2471): false
| +04:28:12 (netscript.grml:2476): echo 'Do you want to [r]eboot or [h]alt the system now? (Press any other key to cancel.)'
| Do you want to [r]eboot or [h]alt the system now? (Press any other key to cancel.)
| +04:28:12 (netscript.grml:2477): unset a
| +04:28:12 (netscript.grml:2478): read -r a
| ++04:28:12 (netscript.grml:2478): wait_exit
| ++04:28:12 (netscript.grml:339): wait_exit(): local e_code=1
| ++04:28:12 (netscript.grml:340): wait_exit(): [[ 1 -ne 0 ]]
| ++04:28:12 (netscript.grml:341): wait_exit(): set_deploy_status error
| ++04:28:12 (netscript.grml:103): set_deploy_status(): '[' -n error ']'
| ++04:28:12 (netscript.grml:104): set_deploy_status(): echo error
| ++04:28:12 (netscript.grml:343): wait_exit(): trap '' 1 2 3 6 15 ERR EXIT
| ++04:28:12 (netscript.grml:344): wait_exit(): status_wait
| ++04:28:12 (netscript.grml:329): status_wait(): [[ -n 0 ]]
| ++04:28:12 (netscript.grml:329): status_wait(): [[ 0 != 0 ]]
| ++04:28:12 (netscript.grml:345): wait_exit(): exit 1
As of grml-autoconfig v0.20.3 and newer, the grml-autoconfig systemd service
that invokes the deployment netscript uses `StandardInput=null` instead of
`StandardInput=tty` (see https://github.com/grml/grml/issues/176).
This exposed a logic error in our deployment script: we exit the script
early when in interactive mode, and only *afterwards* prompt for
reboot/halt with `read -r a` - which of course fails if stdin is
missing. As a result, we end up in our signal handler `trap 'wait_exit;'
1 2 3 6 15 ERR EXIT` and then fail the deployment.
So instead prompt for "Do you want to [r]eboot or [h]alt ..." *only* in
interactive mode, and while at it drop the "if "$INTERACTIVE" ; then
exit 0 ; fi" so the prompt is actually presented to the user.
Change-Id: Ia89beaf3c446f3701cc30ab21cfdff7b5808a6d3
Manual execution of Python's http.server has multiple drawbacks, like no
proper logging and no service tracking/restart options; most notably,
the deployment status server no longer runs once our deployment script
fails.
While /srv/deployment/status then still might contain "error", no one is
serving that information on port 4242 any longer[1], and our
daily-build-install-vm Jenkins job might then report:
| VM '192.168.209.162' current state is '' - retrying up to another 1646 times, sleeping for a second
| VM '192.168.209.162' current state is '' - retrying up to another 1645 times, sleeping for a second
| [...]
It then runs for ~1/2 hour without doing anything useful, until the
Jenkins job itself gives up.
By running our deployment status server under systemd, we keep the
service alive also when the deployment script terminates. In case of
errors we get immediate feedback:
| VM '192.168.209.162' current state is 'puppet' - retrying up to another 1648 times, sleeping for a second
| VM '192.168.209.162' current state is 'puppet' - retrying up to another 1647 times, sleeping for a second
| VM '192.168.209.162' current state is 'error' - retrying up to another 1646 times, sleeping for a second
| + '[' error '!=' finished ']'
| + echo 'Failed to install Proxom VM '\''162'\'' (IP '\''192.168.209.162'\'')'
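A minimal sketch of such a systemd-managed status server (unit name and paths are assumptions, not necessarily what our deployment ships):
| cat > /run/systemd/system/deployment-status.service << EOF
| [Unit]
| Description=Sipwise deployment status server
|
| [Service]
| WorkingDirectory=/srv/deployment
| ExecStart=/usr/bin/python3 -m http.server 4242
| Restart=on-failure
| EOF
| systemctl daemon-reload
| systemctl start deployment-status.service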
[1] For our NGCP based installations we use the ngcpstatus boot option,
whose status_wait trap kicks in and avoids a premature exit of the
deployment status server. But e.g. our non-NGCP systems don't use that
boot option, and with this change we could get rid of status_wait
altogether.
Change-Id: Ibaa799358caedf31c64c37b48e3c5e889808086a
Use system-tools' ngcp-initialize-udev-rules-net script to
deploy /etc/udev/rules.d/70-persistent-net.rules; no need
to maintain the code in multiple places.
Change-Id: I81925262a8c687aa9976cbc1113568989fa53281
When building our Debian boxes for buster, bullseye + bookworm (via the
daily-build-matrix-debian-boxes Jenkins job), we get broken networking,
so e.g. `vagrant up debian-bookworm` doesn't work.
This is caused by /etc/network/interfaces (using e.g. "neth0", being our
naming schema which we use in NGCP, as adjusted by the deployment
script) not matching the actual system network devices (like enp0s3).
TL;DR: no behavior change for NGCP systems; only when building non-NGCP
systems do we enable net.ifnames=0 (via set_custom_grub_boot_options),
but we do *not* generate /etc/udev/rules.d/70-persistent-net.rules (via
generate_udev_network_rules) nor rename eth*->neth* in
/etc/network/interfaces.
More verbose version:
* rename the "eth*" networking interfaces into "neth*" in
/etc/network/interfaces only when running in ngcp-installer mode
(this is the behavior we rely on in NGCP, but it doesn't matter
for plain Debian systems)
* generate /etc/udev/rules.d/70-persistent-net.rules only when running
in ngcp-installer mode. While our jenkins-configs.git's
jobs/daily-build/scripts/vm_clean-fs.sh removes the file anyway (for
the VM use case), between the initial deployment run and the next reboot
the configuration inside the PVE VM still applies, so we end up with
an existing /etc/udev/rules.d/70-persistent-net.rules, referring to
neth0, while our /etc/network/interfaces configures eth0 instead.
* when *not* running in ngcp-installer mode, enable net.ifnames=0 usage
in GRUB to disable persistent network interface naming. FTR, this
change is *not* needed for NGCP, as on NGCP systems we use
/etc/udev/rules.d/70-persistent-net.rules, generated by
ngcp-system-tools' ngcp-initialize-udev-rules-net script, also in the
VM use case.
This is a fixup for a change in git commit a50903a30c (see also the
commit message of git commit ab62171), which should have been adjusted
for ngcp-installer-only mode instead.
Change-Id: I6d0021dbdc2c1587127f0e115c6ff9844460a761
If the date of the running system isn't accurate enough, then apt
runs might fail with something like:
| E: Release file for https://deb.sipwise.com/spce/mr10.5.2/dists/bullseye/InRelease is not valid yet (invalid for another 6h 19min 2s)
So let's try to sync date/time of the system via NTP. Given that chrony
is a small (only 650 kB disk space) and secure replacement for ntp,
let's ship chrony with the Grml deployment ISO (and fall back to ntp
usage in deployment script if chrony shouldn't be available).
Also, if the system is configured to read the RTC time in the local time
zone, this is known as another source of problems, so let's make sure to
use the RTC in UTC.
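A minimal sketch of that sync step (hedged; the NTP server and the exact fallback handling are assumptions):
| if command -v chronyd >/dev/null 2>&1 ; then
|   # one-shot time sync, then exit again (-q), instead of starting a daemon
|   chronyd -q 'server pool.ntp.org iburst'
| else
|   ntpdate pool.ntp.org
| fi
| # make sure the RTC is interpreted as UTC and write the synced time back to it
| hwclock --systohc --utc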
Change-Id: I747665d1cee3b6f835c62812157d0203bcfa96e2
For deploying Debian/bookworm (see MT#55524), we'd like to have an
updated Grml ISO. With such a Debian/bookworm based live system, we can
still deploy older target systems (like Debian/bullseye).
Relevant changes:
1) Add jo as a new build-dependency, to generate build information in
conf/buildinfo.json (new dependency of grml-live)
2) Always include ca-certificates, as this is required with more recent
   mmdebstrap versions (>=0.8.0) when using apt repositories with
   https; otherwise bootstrapping Debian fails.
3) Update to latest stable grml-live version v0.42.0, which:
a) added support for "bookworm" as suite name
cff66073a7
b) provides corresponding templates for memtest support:
c01a86b3fc
c) and a workaround for a kmod/initramfs-tools issue with PXE/NFS boot:
ea1e5ea330
4) Update memtest86+ to v6.00-1 as present in Debian/bookworm and
add corresponding UEFI support (based on grml-live's upstream change,
though as we don't support i386, dropped the 32bit related bits)
Change-Id: I327c0e25c28f46e097212ef4329d75fc8d34767c
We build the pre-loaded library targeting a specific Debian release,
which might be different from (and newer than) the release Grml was
built for. This can cause missing versioned symbols (and a loading
failure) if the libc in the outer system is older than the one in the
inner system.
Change-Id: I84f4f307863e534fe0fff85274ae1d5db809012c
Git commit 6661b04af0 broke all our bullseye based builds
(debian, sipwise + docker), see
https://jenkins.mgm.sipwise.com/view/All/job/daily-build-matrix-debian-boxes/
For plain Debian installations we don't have SP_VERSION available,
so default to the value that was used before we added support for
trunk-weekly next to trunk.
Change-Id: I61958f0c67d165d2f6dcb059fe4991ed24a328c9
We want to be able to track down any left-behind tmp files,
so ensure we're creating them with recognizable file names.
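A minimal sketch of such a recognizable temp file (the template name is illustrative):
| # include a deployment-specific prefix so left-behind files can be identified later
| TMP_FILE=$(mktemp /tmp/deployment-netscript.XXXXXXXX)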
Change-Id: I4eb44047f2eb86ba9f0a8aeeb8d6555290f60c00
It's needed for support of spN nodes.
Sort options in deployment.sh.
Remove unused boot options ngcpnonwrecfg and ngcpfillcache.
Change-Id: I300e533c15b71d65e768ca2ed4b3a73eb7ec6954
Merge all options parsing into a single place.
Move options parsing to the top of the script.
Parse boot options first, then cmd options if they exist.
Simplify some checks.
Remove unused options.
Change-Id: Ibcb099d9bb2ba26ffed9904c8e5065b392ecb78a
Sort default values.
Rework cmd parameter parsing - remove some reassignments, reformat
to be clearer, etc.
Add default values for the options CROLE, EADDR, EXTERNAL_NETMASK and ROLE.
Change-Id: I287facafeb53dc5390517424935c8a50932246dc
If grml-debootstrap detects an existing FAT filesystem on the EFI partition,
it doesn't modify/re-create it:
| EFI partition /dev/nvme0n1p2 seems to have a FAT filesystem, not modifying.
The underlying check is execution of `fsck.vfat -bn $DEVICE`.
With fsck.fat from dosfstools v4.1-2 as present in Debian/buster we got:
| root@grml ~ # fsck.vfat -bn /dev/nvme0n1p2
| fsck.fat 4.1 (2017-01-24)
| 0x41: Dirty bit is set. Fs was not properly unmounted and some data may be corrupt.
| Automatically removing dirty bit.
| There are differences between boot sector and its backup.
| This is mostly harmless. Differences: (offset:original/backup)
| 0:00/eb, 82:00/46, 83:00/41, 84:00/54, 85:00/33, 86:00/32, 87:00/20
| , 88:00/20, 89:00/20, 510:00/55, 511:00/aa
| Not automatically fixing this.
| Leaving filesystem unchanged.
| 1 root@grml ~ #
Now with dosfstools v4.2-1 as present in Debian/bullseye, this might become:
| root@grml ~ # fsck.vfat -bn /dev/nvme0n1p2
| fsck.fat 4.2 (2021-01-31)
| There are differences between boot sector and its backup.
| This is mostly harmless. Differences: (offset:original/backup)
| 0:00/eb, 65:01/00, 82:00/46, 83:00/41, 84:00/54, 85:00/33, 86:00/32
| , 87:00/20, 88:00/20, 89:00/20, 510:00/55, 511:00/aa
| Not automatically fixing this.
In such situations we end up with an incomplete/broken EFI partition,
which breaks within our efivarfs post-script:
| Mounting /dev/nvme0n1p2 on /boot/efi
| mount: /boot/efi: wrong fs type, bad option, bad superblock on /dev/nvme0n1p2, missing codepage or helper program, or other error.
| ESC[31;01m-> Failed (rc=1)ESC[0m
| ESC[32;01m*ESC[0m Removing chroot-script again
| ESC[32;01m*ESC[0m Executing post-script /etc/debootstrap/post-scripts//efivarfs
| Executing /etc/debootstrap/post-scripts//efivarfs
| Mounting /dev (via bind mount)
| Mounting /boot/efi
| mount: /boot/efi: special device UUID= does not exist.
Change-Id: I46939b4e191982a84792f3aca27c6cc415dbdaf4
When we run current versions of deployment.sh, which include the fix
from commit f9aea18c, in combination with grml-debootstrap <=0.96 (as
shipped by our Grml deployment ISO version sipwise20210511), deployments
using EFI might fail with:
| Mounting /dev/nvme0n1p2 on /boot/efi
| Invoking efibootmgr
| EFI variables are not supported on this system.
| -> Failed (rc=1)
| [...]
| Mounting /dev (via bind mount)
| Mounting efivarfs on /sys/firmware/efi/efivars
| Invoking grub-install with proper EFI environment
| chroot: failed to run command 'grub-install': No such file or directory
| -> Failed (rc=127)
This is caused by a failing invocation of efibootmgr from within
grml-debootstrap (versions <=0.96 running with Debian kernel
>=5.10), causing grml-debootstrap to exit at that point. As a result,
the EFI-specific GRUB steps in grml-debootstrap's grub_install() from
within chroot-script don't get executed. Therefore the grub-efi-amd64
package is missing for usage by our efivarfs post-script.
By re-introducing the efivarfs pre-script from commit 535e6df3
we can work around this bug.
Furthermore, when /boot/efi is to be mounted within the target system
by our efivarfs post-script, it might fail if /proc isn't available, like:
| # chroot /mnt mount /boot/efi
| mount: /boot/efi: can't find UUID=FE60-5B75.
This can be fixed by making sure to mount /proc, /sys etc. *before*
/boot/efi. Then scanning for the UUID device (as configured in
/etc/fstab) works as expected.
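A minimal sketch of that ordering inside the post-script (hedged; ${MNTPOINT} and the exact mount set are illustrative):
| # mount the pseudo filesystems first, so the UUID= lookup from /etc/fstab works
| mount -t proc none "${MNTPOINT}/proc"
| mount -t sysfs none "${MNTPOINT}/sys"
| mount --bind /dev "${MNTPOINT}/dev"
| # only now mount the ESP as configured in the target's /etc/fstab
| chroot "${MNTPOINT}" mount /boot/efi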
While at it fix a comment regarding grml-debootstrap >=v0.97 vs >=v0.99,
as only v0.99 behaves as expected with our EFI requirements.
Change-Id: I9db677a06f7e161f971743fc18b034ad3191a449
This is a followup fixup for commit 535e6df / Change-Id: I5374322cb0a39cfed6563df6c4c30f1eafe560c1
We had to apply fixes for the efivars vs efivarfs situation in kernel
versions >=5.10, and addressed them in commit 535e6df. Those changes
turned out to be insufficient, because the fix included in
grml-debootstrap v0.97 is itself incomplete: while efibootmgr was
properly invoked and working, the invocation of grub-install doesn't
reliably work (as at that point /sys/firmware/efi/efivars is no longer
accessible). GRUB installation on
EFI systems without /sys/firmware/efi/efivars present warns with "EFI
variables are not supported on this system" (see
https://sources.debian.org/src/grub2/2.04-20/debian/patches/efi-variable-storage-minimise-writes.patch/?hl=650#L650),
though returns with exit code 0. This leaves us with an incomplete and
therefore not booting GRUB EFI environment.
This used to work with mr9.5.1 only, because there we install(ed)
systems using grml-debootstrap v0.96, which is *older* than the version
v0.97 (which included the EFI workaround) we check for in deployment.sh.
Since the grml-debootstrap version v0.96 isn't recent enough there, we
applied the fallback to our local scripts, which took care of proper
installation of GRUB in EFI environments.
On the other side, in recent trunk deployments we have grml-debootstrap
v0.98 available, which includes the EFI workaround - therefore our local
scripts aren't applied. The resulting installation is incomplete, and
recent trunk deployments fail to boot in EFI environments.
The corresponding fix for grml-debootstrap has been made and is going
to be released in the next few days as v0.99. But to ensure that it
also works with older grml-debootstrap versions (and we don't have to
rebuild our squashfs environments), the local scripts have been
adjusted.
We don't even need any pre-script at all, instead we handle all of the
GRUB EFI installation through /etc/debootstrap/post-scripts/efivarfs.
FTR: this issue didn't show up on certain of our test systems, because
SW-RAID is used there. In deployment.sh we have special handling
of SW-RAID regarding efibootmgr and grub-install, see line 2330 ff.
Change-Id: Ifa90fbfab7d69bc331acfec15a6cc9318c84ee8f
Jobs like daily-build-matrix-debian-boxes build plain Debian machines,
not NGCP-based ones. At the moment we're generating the udev-rules for
network renaming unconditionally, so we have to handle this
consistently: either both steps conditionally (and not for "plain"
systems), or both unconditionally, so that networking can be brought up
by a matching /etc/network/interfaces after the devices come up with
the new names.
There is a good-ish argument for continuing to use eth0, as it is more
of a default, but we've already been deviating from the default for
several years and Debian stable releases by having these names rather
than ones like "ens18" or "enp4s0f2", which are the default in Debian
nowadays, at least since buster.
So it is probably better to keep it consistent with our other machines
and use the "neth*" naming for these too.
Change-Id: I6b3b49a1769894580df768abb817ae5196e65963
The code removed was enabled when $VAGRANT=true, which happens when
passing the "vagrant" parameter to deployment.sh, as done in places
like the proxmox-vm-clone job, the base of many of our test machines.
VMs do not necessarily have the same hardware configuration, so removing
udev-rules for network devices makes sense in principle - especially
since from the beginning we were using network devices named "eth*"
everywhere, even if in the last years we had to use net.ifnames=0 and
udev-rules files on hardware to keep using the "eth*" names.
However, now with mr9.5 and the move to Debian bullseye we have to start
using different names, and we settled on the direct translation to
"neth*". So we need a way to assign whatever network devices the
machines come with, including VMs, to names "neth*".
(If we used the new-permanent device names like ens18 or enp3s0f1 we
would have to adapt network.yml and files like network interface, and
they would be different across all the different machines (HW and VM) so
this is not a better or faster solution to the problem.)
So, back to the topic of removal of this udev-rules file: in many cases
in our test infra, the machines are built "in place" and then rebooted
for upgrades or tests, in principle with the same hardware
configuration, so there is no need to remove these files.
In cases where the underlying (virtualized) hardware changes, e.g. for
local VirtualBox-based vagrant machines, we will need to adapt the
rules for the existing devices.
Change-Id: I57e39a2ec6849f3b5bb8f6cf518e2a2923ec19cb
Using "eth*" names was discouraged for many years, we've been finding
problems here and there and working around them with the help of
udev-rules (/etc/udev/rules.d/70-persistent-net.rules) to map address
interfaces according to PCIIDs, using "net.ifnames=0" as Linux kernel
boot parameter when booting in GRUB, etc.
Finally we found insurmountable problems when moving to Debian bullseye
(mr9.5): when attempting to rename interfaces on some hardware systems
that we use, we got race conditions and renaming clashes that we could
not solve in other ways.
We had different alternatives:
- Use names purely deterministic, based on PCI paths (for example
"enp4s0f1"), MAC address or other of the alternatives, which would be
"definitive", but given that we have a diversity of hardware and VM
installations in customers the devices in different systems would be
different, and the fact that it would be easier to mistype or confuse
them makes this not ideal.
- Use names purely based on functionality, like for example "ha0",
"ext0" or "int0". The problem in this case is that we would have to
find names that would satisfy everyone (and there's no time for doing
this at this point), that different of our system types are quite
different (e.g. Pro without bonds, Carrier with bonds and many vlans
by default; using the same hardware), and some customers with
different installations or needs (e.g. using VMs) have also totally
different network configuration -- so any attempt to unify this to
make good use of the functionality-based names would be very
challenging.
- Finally, there's the option to use some symbolic names similar to
traditional names like "eth0", but without being exactly this.
Popular names in general, although there's no wide consensus, are
names like "net0" and "lan0".
Talking with the groups involved in deploying and maintaining the
system, the decision was taken not to move to purely deterministic
names; there's no time for purely symbolic ones (they also didn't
express much interest in them), and they prefer something more
traditional that they are already used to. Instead of names like "net0"
or "lan0", they prefer the more direct mapping to the existing
interfaces, like "neth0".
This is ugly or slightly discomforting to use for some, but since the
main users (among us) of these names prefer them, so be it. It has the
advantage of a very simple and mechanical translation based on the
current names, which helps especially at the critical time of upgrading
existing systems to the new names.
Change-Id: I4a168c7d81e40f609749f77a509d2acb72d3a9d3
This is commit cd50e4934c applied again.
As explained in ab62171c49, the original
change had to be reverted because, even though things work perfectly
fine otherwise, in the case of Vagrant machines (or when passing the
"vagrant" parameter to the script) the udev-rules for persistent-net
devices get removed; the network interfaces then get "random" names,
the configuration in /etc/network/interfaces doesn't match, and the
network is not brought up.
This removal happens for the {ce,pro,carrier}-trunk.mgm machines of
our tests, where it shouldn't be needed, and also in the images created
for Vagrant machines, which is understandable because the machines could
be brought up with different PCIIDs in different versions of VirtualBox,
or due to some other difference -- not sure how we can ensure that the
PCIIDs as written in the udev-rules files will work in that case.
But in principle this change must go ahead when we solve these problems,
so submitting it again to be ready.
Change-Id: Ib39481a2608aa56e6ec6c9255e290787a6ce3af7
Run the installer under "eatmydata" to speed up the process. Also add
some more information about timing.
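A minimal sketch of the wrapped installer invocation plus timing output (hedged; the real call in deployment.sh carries more options):
| # eatmydata turns fsync()/sync() into no-ops, which speeds up heavy package installation
| start_seconds=$(date +%s)
| grml-chroot "${TARGET}" eatmydata ngcp-installer
| echo "ngcp-installer finished after $(( $(date +%s) - start_seconds )) seconds"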
In some VMs that we install daily ({ce,pro,carrier}-trunk.mgm) we have
the following timings:
ce-runner, no eatmydata:
162 seconds, 2 mins 42 secs
ce-runner, with eatmydata:
142 seconds, 2 mins 22 secs
pro-runner, no eatmydata:
246 seconds, 4 mins 06 secs
pro-runner, with eatmydata:
217 seconds, 3 mins 37 secs
So in these machines, for CE we save about 20 seconds, which is not much
in total but it's about 12.5% saving; and in Pro about 30 seconds (and
twice, once per machine, so about a minute in total), which is about
12.2% as well.
In Carrier, which is mostly equivalent to Pro in this respect and
typically at least 8 machines, it would mean about 4 mins in total.
When installing on hardware in previous days, maybe due to the disks
being slower, the total installation time was slightly longer:
pro-hardware (Lenovo ThinkSystem SR250), with eatmydata:
226 seconds, 3 mins 46 secs
Installing without eatmydata has not been measured on hardware yet, but
given that the installation time is similar to the pro-runner case, the
performance gain is probably similar too.
This looks like a relevant saving and the risk of things going wrong is
minimal, so enable it by default.
Change-Id: I8267fad08ff337c02801fb8fad0433d9b6d9f4c2
This reverts commit cd50e4934c.
In principle this works fine when using
/etc/udev/rules.d/70-persistent-net.rules, but it turns out that in the
test infrastructure (including {ce,pro,carrier}-trunk.mgm machines and
build-matrix) we remove the generated rules in many places:
if $VAGRANT; then
...
# MACs are different on buildbox and on local VirtualBox
# see http://ablecoder.com/b/2012/04/09/vagrant-broken-networking-when-packaging-ubuntu-boxes/
echo "Removing '${TARGET_UDEV_PERSISTENT_NET_RULES}'"
rm -f "${TARGET_UDEV_PERSISTENT_NET_RULES:?}"
So in this way, the interfaces that we get in our infra for the
{ce,pro,carrier}-trunk.mgm machines are ens18, the generated
/etc/network/interfaces using the fixed names "eth*" (in the process of
being renamed to "neth*") refers to interfaces that don't exist on
those systems, and all build-install-vm jobs fail.
In a local vagrant machine (ce-trunk from just before the change) we
have names like these for the network devices:
root@spce:~# dmesg | grep rename
[ 2.051263] e1000 0000:00:09.0 enp0s9: renamed from eth1
[ 2.065876] e1000 0000:00:03.0 enp0s3: renamed from eth0
[ 3.950540] e1000 0000:00:03.0 eth0: renamed from enp0s3
[ 4.049842] e1000 0000:00:09.0 eth1: renamed from enp0s9
The boot session from which the logs above were taken was booted via
GRUB without "net.ifnames=0", with the udev "70-persistent-net.rules"
generated in place with the right information, and then of course
things work fine.
So we need some solution for this before moving on with the change now
reverted.
Change-Id: I25d3b9c175b92214670ebb63a7916b60e0e4e5f9
Current trunk installations based on bullseye using recent Grml
environments are broken, as EFI environments running with recent kernel
versions (>=5.10) aren't properly detected anymore.
This is caused by the missing efivars kernel module.
CONFIG_EFI_VARS is no longer available since
20146398c4
(tagged initially as debian/5.10.1-1_exp1 + shipped with kernel package
5.10.1-1~exp1 and newer, incl. 5.10.38-1 as present in current
Debian/unstable). Therefore the kernel module efivars is no longer
available on more recent Debian kernel systems.
Quoting from https://wiki.debian.org/UEFI:
| The older interface was efivars, showing files under
| /sys/firmware/efi/vars, and this is what was used by default in both
| Wheezy and Jessie.
|
| The new interface is efivarfs, which will expose things in a slightly
| different format under /sys/firmware/efi/efivars. This is the new
| preferred way of using UEFI configuration variables, and Debian switched
| to it by default from Stretch onwards.
CONFIG_EFI_VARS is no longer required, instead efivarfs seems to be
available starting with kernel v3.10 and newer (see linux.git):
| commit a9499fa7cd3fd4824a7202d00c766b269fa3bda6
| Author: Tom Gundersen <teg@jklm.no>
| Date: Fri Feb 8 15:37:06 2013 +0000
|
| efi: split efisubsystem from efivars
|
| This registers /sys/firmware/efi/{,systab,efivars/} whenever EFI is enabled
| and the system is booted with EFI.
|
| This allows
| *) userspace to check for the existence of /sys/firmware/efi as a way
| to determine whether or it is running on an EFI system.
| *) 'mount -t efivarfs none /sys/firmware/efi/efivars' without manually
| loading any modules.
|
| [ Also, move the efivar API into vars.c and unconditionally compile it.
| This allows us to move efivars.c, which now only contains the sysfs
| variable code, into the firmware/efi directory. Note that the efivars.c
| filename is kept to maintain backwards compatability with the old
| efivars.ko module. With this patch it is now possible for efivarfs
| to be built without CONFIG_EFI_VARS - Matt ]
and:
| commit d68772b7c83f4b518be15ae96f4827c8ed02f684
| Author: Matt Fleming <matt.fleming@intel.com>
| Date: Fri Feb 8 16:27:24 2013 +0000
|
| efivarfs: Move to fs/efivarfs
|
| Now that efivarfs uses the efivar API, move it out of efivars.c and
| into fs/efivarfs where it belongs. This move will eventually allow us
| to enable the efivarfs code without having to also enable
| CONFIG_EFI_VARS built, and vice versa.
|
| Furthermore, things like,
|
| mount -t efivarfs none /sys/firmware/efi/efivars
|
| will now work if efivarfs is built as a module without requiring the
| use of MODULE_ALIAS(), which would have been necessary when the
| efivarfs code was part of efivars.c.
But we also need to ensure /sys/firmware/efi/efivars is mounted,
otherwise efibootmgr fails to execute:
| # efibootmgr
| EFI variables are not supported on this system.
| # lsmod| grep efi
| efi_pstore 16384 0
| efivarfs 16384 1
| # mount -t efivarfs none /sys/firmware/efi/efivars
| # efibootmgr
| BootCurrent: 0002
| Timeout: 3 seconds
| BootOrder: 0001,0002,0003,0000,0004
| Boot0000* UiApp
| Boot0001* UEFI QEMU QEMU HARDDISK
| Boot0002* UEFI PXEv4 (MAC:02B31C8CA0AA)
| Boot0003* UEFI PXEv4 (MAC:92097BD02A48)
| Boot0004* EFI Internal Shell
FTR: we can't test only for existence of directory
/sys/firmware/efi/efivars, as it exists but is empty by default, so we
need to look inside the directory instead.
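A minimal sketch of that check and mount (hedged; efivars_workaround() in deployment.sh does more than this):
| # an empty /sys/firmware/efi/efivars means efivarfs isn't mounted yet
| if [ -d /sys/firmware/efi ] && [ -z "$(ls -A /sys/firmware/efi/efivars 2>/dev/null)" ] ; then
|   modprobe efivarfs 2>/dev/null || true
|   mount -t efivarfs efivarfs /sys/firmware/efi/efivars
| fi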
See https://github.com/grml/grml-debootstrap/pull/174 for the related
grml-debootstrap upstream change, which is supposed to be released as of
grml-debootstrap v0.97.
But as a) grml-debootstrap v0.97 isn't released yet, b) it's unclear
whether grml-debootstrap v0.97 will make it into bullseye (soonish, or
if at all) and c) we don't have the Grml repositories available via our
approx Debian mirror (as used in our PRO/Carrier environments) and don't
want to update our Grml squashfs system for this change neither, we need
to apply a workaround for this efivars vs efivarfs situation. Otherwise
Debian installation fails in EFI environments using Debian kernel
>=5.10. Thankfully we can work around this using according pre/post
scripts in grml-debootstrap, that's what efivars_workaround() is all
about.
Thanks: Manuel Montecelo <mmontecelo@sipwise.com> for the initial patch and Volodymyr Fedorov <vfedorov@sipwise.com> for underlying research
Change-Id: I5374322cb0a39cfed6563df6c4c30f1eafe560c1
The "Building database of manual pages ..." of mandb(8) is invoked
during Debian package installations, and takes a considerable amount of
time[1]. By disabling this, we can speed up our installation process,
similar to what we already do with all our build environments.
If someone really needs the man-db database (for apropos(1) or
whatis(1) usage), then invoking `systemctl restart man-db.service`
provides that on demand.
FTR: there are also /etc/cron.daily/man-db + /etc/cron.weekly/man-db,
though they don't do anything when running under systemd. There's also
man-db.timer, though we don't have it enabled by default on our NGCP
systems.
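A minimal sketch of disabling that database build via debconf (the man-db/auto-update debconf key is the standard one; where exactly we set it during installation is not shown here):
| # tell man-db not to (re)build its database on package installations
| echo 'man-db man-db/auto-update boolean false' | debconf-set-selections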
[1] Demo from a running PRO system:
| root@sp2:~# rm -rf /var/cache/man
| root@sp2:~# time systemctl restart man-db.service
|
| real 1m18.357s
| user 0m0.000s
| sys 0m0.009s
Change-Id: If98007860490adc5ad954e8c36000abd7281931b
Add options to install bullseye in all places where buster is used, use
it as the default when possible, and keep the existing ones for the
moment.
Switch to bullseye in the Dockerfile.
Change-Id: I2f693982ba92a671a6f2254c5a245a1d05231404
The call:
UTS_RELEASE="${KERNELVERSION}" LD_PRELOAD="${FAKE_UNAME}" \
grml-chroot "${TARGET}" /media/cdrom/VBoxLinuxAdditions.run --nox11
fails with:
Running in chroot, ignoring request: daemon-reload
Before 8a54cd1374 this call was skipped, so hide the failure
with '|| true'.
Use 'grml-chroot' instead of 'chroot' as 'grml-chroot' is a wrapper
which also cares about required mountpoints.
Use single style for "${TARGET}" variable.
Change-Id: Icc625c9a58b114f62350fc1e540ddac8a4147f28
Quoting from "man bash" about `-E` (AKA errtrace):
| If set, any trap on ERR is inherited by shell functions, command
| substitutions, and commands executed in a subshell environment.
| The ERR trap is normally not inherited in such cases.
To demonstrate the problem see this short shell script:
| % cat foo
| set -eu -o pipefail
|
| bailout() {
| echo "Bailing out because of error" >&2
| exit 1
| }
| trap bailout 1 2 3 6 9 14 15 ERR
|
| foo() {
| echo "Executing magic"
| magic
| }
|
| foo
| echo end
If "magic" can't be executed, then this fails as follows:
| % bash ./foo
| Executing magic
| ./foo: line 11: magic: command not found
But it doesn't invoke the bailout function via trap.
When using `set -eE` (AKA errexit + errtrace), instead of only
`set -e` (errexit), then it behaves as expected though:
| % bash ./foo
| Executing magic
| ./foo: line 11: magic: command not found
| Bailing out because of error
Change-Id: I26396b87d4a391a75997c061e866709daa57870e
grub-pc >=2.04-11 has a new behavior regarding /boot/grub/i386-pc/
handling, where we end up with an empty /boot/grub/i386-pc/ after
*successful* grub-install execution:
| root@grml ~ # vgchange -ay
| 3 logical volume(s) in volume group "ngcp" now active
| root@grml ~ # mount /dev/mapper/ngcp-root /mnt
| root@grml ~ # grml-chroot /mnt /bin/bash
| Writing /etc/debian_chroot ...
| (spce)root@grml:/# cd
| (spce)root@grml:~# grub-install /dev/sda
| Installing for i386-pc platform.
| Installation finished. No error reported.
| (spce)root@grml:~# ls -la /boot/grub/i386-pc/
| total 16
| drwxr-xr-x 2 root root 12288 Dec 16 12:04 .
| drwxr-xr-x 4 root root 4096 Dec 16 12:07 ..
This causes the installed system to fail to boot with:
| GRUB loading..
| Welcome to GRUB!
|
| error: file `/boot/grub/i386-pc/normal.mod' not found.
| grub rescue> _
The underlying issue is that recent grub versions unlink the files
inside /boot/grub/i386-pc, though they don't report anything about it
(even under `--verbose` execution).
This is triggered in our situation because lvm2's vgs binary isn't
present yet. Earlier versions of grub didn't have a problem with this
and grub-install happily installed the files inside /boot/grub/i386-pc,
even though we installed lvm2 only afterwards via our metapackages. To
ensure lvm2 is available at installation time within grml-debootstrap,
explicitly add it to the list of packages to be installed.
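A minimal sketch of that addition (hedged; /etc/debootstrap/packages is grml-debootstrap's default package list, but the exact way deployment.sh extends it may differ):
| # make sure lvm2 (and thus vgs) is present while grub-install runs in the chroot
| echo lvm2 >> /etc/debootstrap/packages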
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=977544 for further
details regarding the grub bug.
Change-Id: I27a1cd18777526eb26b838fae88d4d87b6e93467
We install virtualbox-guest-additions in the target system for usage
with VirtualBox and shared folders via Vagrant. We invoke the
VBoxLinuxAdditions.run machinery from the running Grml live system. But
the target systems usually has a different kernel package and version
installed, so we have to apply some tricks to get it working. This is
where we rely on fake-uname.so.
Since commit a91baa2 (TT#48647 Ship fake_uname lib in package) we're
relying on fake-uname.so from ngcp-deployment-scripts, instead of
building and shipping it via deployment.sh itself.
But ngcp-deployment-scripts is only available when installing NGCP
- as we install it there and only afterwards invoke
vagrant_configuration() - whereas it's missing when we install a plain
Debian system (like with our debian_bullseye_plain_vagrant.box), which
therefore fails with:
| cp: cannot stat '/mnt/usr/lib/ngcp-deployment-scripts/fake-uname.so': No such file or directory
| ERROR: ld.so: object '/usr/lib/ngcp-deployment-scripts/fake-uname.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
Change-Id: I639a43c3deafd2fc188350936e15f48482103209
The ensure_packages_installed function ensures that specified packages
are present during runtime. This is used e.g. for installation of
virtualbox-guest-additions-iso Debian package from within
vagrant_configuration(), which is used to execute
/media/cdrom/VBoxLinuxAdditions.run inside the target system.
We can't use random Debian repositories though, as the package
dependencies need to match the running live system. So far we only used
the buster repository, as our current grml-sipwise ISOs are based on
something close to buster.
On the other hand we can't use virtualbox-guest-additions-iso from
Debian/buster in our Debian/bullseye Vagrant boxes, as
/sbin/mount.vboxsf doesn't work then.
So use the bullseye repository if the release of the target system is
bullseye, which seems to work with our current Grml ISOs and current
state of bullseye.
Change-Id: Iaf965daa6ff7a62e2b3bd8c55b8f761abd94c241
Nowadays we only deploy stretch + buster based Debian systems; drop the
release-specific checks so that bullseye and newer Debian releases are
also supported.
Change-Id: Ibf3d1527ccaeba60526a730e6886e6521c08d20e
The /usr/bin/python symlink/binary no longer exists on recent
Grml-Sipwise ISOs, and python3 doesn't ship SimpleHTTPServer but
http.server instead.
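A minimal sketch of the adjusted invocation (port and directory are the ones mentioned elsewhere in this log):
| # before: python -m SimpleHTTPServer 4242
| cd /srv/deployment && python3 -m http.server 4242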
Change-Id: I6677e8a416b142034d99d5b1d2b11ba74d87a6ec
No need to install this package on non-vagrant systems.
Do not add this package to the Grml-Sipwise image either - it's too
heavy (86M) and not needed on real systems.
Change-Id: I9ec9ff76d588f4ced30ba199f05bb167eec5288a
Previously the functions and the main code were mixed, so it was hard
to understand the flow of execution.
Move all functions to a single section at the beginning of the script
so it's easier to understand the code.
Fix shellcheck warnings:
SC2086: Double quote to prevent globbing and word splitting.
SC2154: efidev1 is referenced but not assigned.
Change-Id: Ie4bf28c166e4a9ff236eff807ee97adae6ecddd0
This variable is used in some ngcpcfg *.services files.
Specifically, 'ngcp-provisioning-tools/mysql_values.cfg.services' uses
this variable to specify the MySQL DB host and whether it's necessary
to reload kamailio.
Change-Id: Ibf137cdd0ad6f6492a30cfa715c468e4ac22832f