225 Commits (236cb2d1a76624c8ac9446470e07f318299b9298)
Author | SHA1 | Message | Date |
---|---|---|---|
|
236cb2d1a7 |
MT#58926 Vagrant: ensure to have libxmu6 available
We get the following error message in /var/log/vboxadd-install.log, /var/log/deployment-installer-debug.log, /var/log/daemon.log + /var/log/syslog: | /opt/VBoxGuestAdditions-7.0.6/bin/VBoxClient: error while loading shared libraries: libXmu.so.6: cannot open shared object file: No such file or directory This is caused by missing libxmu6: | [sipwise-lab-trunk] sipwise@spce:~$ /opt/VBoxGuestAdditions-7.0.6/bin/VBoxClient --help | /opt/VBoxGuestAdditions-7.0.6/bin/VBoxClient: error while loading shared libraries: libXmu.so.6: cannot open shared object file: No such file or directory | [sipwise-lab-trunk] sipwise@spce:~$ sudo apt install libxmu6 | Reading package lists... Done | Building dependency tree... Done | Reading state information... Done | The following NEW packages will be installed: | libxmu6 | 0 upgraded, 1 newly installed, 0 to remove and 83 not upgraded. | Need to get 60.1 kB of archives. | After this operation, 143 kB of additional disk space will be used. | Get:1 https://debian.sipwise.com/debian bookworm/main amd64 libxmu6 amd64 2:1.1.3-3 [60.1 kB] | Fetched 60.1 kB in 0s (199 kB/s) | [...] | [sipwise-lab-trunk] sipwise@spce:~$ /opt/VBoxGuestAdditions-7.0.6/bin/VBoxClient --help | Oracle VM VirtualBox VBoxClient 7.0.6 | Copyright (C) 2005-2023 Oracle and/or its affiliates | | Usage: VBoxClient --clipboard|--draganddrop|--checkhostversion|--seamless|--vmsvga|--vmsvga-session | [-d|--nodaemon] | | Options: | [...] It looks like lack of libxmu6 doesn't cause any actual problems for our use case (we don't use X.org at all), though given that libxmu6 is a small library package, let's try to get it working as expected and avoid the alarming errors on the logs. Thanks Guillem Jover for spotting and reporting Change-Id: I65f3dd496a4026f04fd9944fd7cc43d6abbdf336 |
1 year ago |
|
8c3ab6b241 |
MT#57559 Always include zstd when bootstrapping systems
During initial deployment of a system, we get warnings about lack of zstd: | Setting up linux-image-6.1.0-13-amd64 (6.1.55-1) ... | I: /vmlinuz.old is now a symlink to boot/vmlinuz-6.1.0-13-amd64 | I: /initrd.img.old is now a symlink to boot/initrd.img-6.1.0-13-amd64 | I: /vmlinuz is now a symlink to boot/vmlinuz-6.1.0-13-amd64 | I: /initrd.img is now a symlink to boot/initrd.img-6.1.0-13-amd64 | /etc/kernel/postinst.d/initramfs-tools: | update-initramfs: Generating /boot/initrd.img-6.1.0-13-amd64 | W: No zstd in /usr/bin:/sbin:/bin, using gzip | [...] The initramfs generation and update overall runs *four* times within the initial bootstrapping of a system (we'll try to do something about this, but this is outside the scope of this). As of initramfs-tools v0.141, initramfs-tools uses zstd as default compression for initramfs. Version 0.142 is shipped with Debian/bookworm, and therefore it makes sense to have it available upfront. Note that also the initrd generation is faster with zstd (~10sec for zstd vs. ~13sec for gzip) and also the resulting initrd is smaller (~33MB for zstd vs ~39MB for gzip). By making sure that zstd is available straight from the very beginning and before ngcp-installer pulls it in later, we can avoid the warning message but also save >10 seconds of install time. Given that zstd is available even in Debian oldoldstable, let's install it unconditionally in all our systems. Thanks: Volodymyr Fedorov for reporting Change-Id: I56674c3c213f7c7a6e6cbce3c8e2e00a4cfbdbd4 |
1 year ago |
|
9cceb8d655 |
MT#58356 ntp: Use ntpsec.service instead of ntp.service
Even though the ntpsec.service contains an Alias for ntp.service, that does not work for us when the service has not yet been installed, so the first run will fail. Use the actual name to avoid this issue. Change-Id: I8f0ee3b38390a7e58c3bbee65fd96bfd4b717dfa |
2 years ago |
|
793a93bc43 |
MT#57453 vagrant_configuration: remove fake systemd presence after execution
Let's restore system state of /run/systemd/system for
VBoxLinuxAdditions, to avoid any unexpected side effects.
Followup for git rev
|
2 years ago |
|
561303359e |
MT#57453 Use tty1 for stdin when running under grml-autoconfig service
Recent Grml ISOs, including our Grml-Sipwise ISO (v2023-06-01), include
grml-autoconfig v0.20.3 which execute the grml-autoconfig service under
`StandardInput=null`. This is necessary to not conflict with tty usage,
like used with serial console. See
|
2 years ago |
|
8601193128 |
MT#57453 vagrant_configuration: fake systemd presence
As of git rev
|
2 years ago |
|
6c960afee4 |
TT#104221 Use bookworm repos in ensure_packages_installed appropriately
Support bookworm option in DEBIAN_RELEASE selection. We have support for it already. Use bookworm as fallback since nowadays we jumped to it. Change-Id: I118c1b5cf81fe57394495b5f745fc81032406c78 |
2 years ago |
|
37163532ee |
MT#56773 Use bullseye puppetlabs repository for bookworm
To be able to upgrade our internal systems to Debian/bookworm we need to have puppet packages available. Upstream still doesn't provide any Debian packages (see https://tickets.puppetlabs.com/browse/PA-4995), though their AIO (All In One) packages for Debian/bullseye seem to be working on Debian/bookworm as well (at least for puppet-agent). So until we have either migrated to puppet-agent as present in Debian/bookworm or upstream provides corresponding AIO packages, let's use the puppet-agent packages we already use for our Debian/bullseye systems. Change-Id: I2211ffd79f70a2a79873e737b0b512bfb7492328 |
2 years ago |
|
0fedba6144 |
MT#57643 Ensure /var/lib/dpkg/available exists on Debian releases <=buster
Since version 1.20.0, dpkg no longer creates /var/lib/dpkg/available (see #647911). Now that we upgraded our Grml-Sipwise deployment system to bookworm, we have dpkg v1.21.22 on our live system, and mmdebstrap relies on dpkg of the host system for execution. But on Debian releases until and including buster, dpkg fails to operate with e.g. `dpkg --set-selections`, if /var/lib/dpkg/available doesn't exist: | The following NEW packages will be installed: | nullmailer | [...] | debconf: delaying package configuration, since apt-utils is not installed | dpkg: error: failed to open package info file '/var/lib/dpkg/available' for reading: No such file or directory We *could* also switch from mmdebstrap to debootstrap for deploying Debian releases <=buster, but this would be slower and we use mmdebstrap since quite some time for everything. So instead let's create /var/lib/dpkg/available after bootstrapping the system. Reported towards mmdebstrap as #1037946. Change-Id: I0a87ca255d5eb7144a9c093051c0a6a3114a3c0b |
2 years ago |
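The workaround described in 0fedba6144 can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: the function name `ensure_dpkg_available`, its arguments, and the explicit release list are assumptions.

```shell
# Hypothetical helper (name and arguments are assumptions, not from the commit).
# Creates an empty /var/lib/dpkg/available inside the bootstrapped target
# system, but only for Debian releases up to and including buster.
ensure_dpkg_available() {
  local target="$1"    # mount point of the bootstrapped system
  local release="$2"   # Debian release name of the target system

  case "${release}" in
    jessie|stretch|buster)
      # Assumed list of affected releases; adjust to the releases deployed.
      if [ ! -e "${target}/var/lib/dpkg/available" ]; then
        : > "${target}/var/lib/dpkg/available"
      fi
      ;;
    *)
      # Per the commit message, only releases <=buster need the file.
      ;;
  esac
}
```

It would be called once after bootstrapping, for example `ensure_dpkg_available /mnt buster`, where `/mnt` stands in for wherever the target system is mounted.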
|
eccdc586ae |
MT#57644 puppet/git: allow ssh-rsa pubkey usage
Now that our deployment system is based on Debian/bookworm, but our gerrit/git server still runs on Debian/bullseye, we run into the OpenSSH RSA issue (RSA signatures using the SHA-1 hash algorithm got disabled by default), see https://michael-prokop.at/blog/2023/06/11/what-to-expect-from-debian-bookworm-newinbookworm/ and https://www.jhanley.com/blog/ssh-signature-algorithm-ssh-rsa-error/ We need to enable ssh-rsa usage, otherwise deployment fails with: | Warning: Permanently added '[gerrit.mgm.sipwise.com]:29418' (ED25519) to the list of known hosts. | sign_and_send_pubkey: no mutual signature supported | puppet-r10k@gerrit.mgm.sipwise.com: Permission denied (publickey). | fatal: Could not read from remote repository. Change-Id: I5894170dab033d52a2612beea7b6f27ab06cc586 |
2 years ago |
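One way to allow ssh-rsa for a single host is a per-host stanza in the OpenSSH client configuration. The sketch below is an assumption about how this could be done, not the change made in eccdc586ae; the function name is hypothetical. `PubkeyAcceptedAlgorithms` is the OpenSSH client option for this setting (older OpenSSH releases call it `PubkeyAcceptedKeyTypes`).

```shell
# Hypothetical helper: appends a stanza that re-enables ssh-rsa public key
# signatures for one host only, leaving the default for all other hosts.
allow_ssh_rsa_for_host() {
  local config_file="$1"   # e.g. ~/.ssh/config of the deployment user
  local host="$2"          # host that still requires ssh-rsa

  cat >> "${config_file}" <<EOF
Host ${host}
    PubkeyAcceptedAlgorithms +ssh-rsa
EOF
}
```

Restricting the option to the gerrit host keeps the weaker algorithm disabled everywhere else.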
|
8cfb8c8392 |
MT#57630 Check online connectivity to work around Intel E810 / ice issue
Deploying the Debian/bookworm based NGCP system fails on a Lenovo sr250 v2 node with an Intel E810 network card: | # lshw -c net -businfo | Bus info Device Class Description | ======================================================= | pci@0000:01:00.0 eth0 network Ethernet Controller E810-XXV for SFP | pci@0000:01:00.1 eth1 network Ethernet Controller E810-XXV for SFP | # lshw -c net | *-network:0 | description: Ethernet interface | product: Ethernet Controller E810-XXV for SFP | vendor: Intel Corporation | physical id: 0 | bus info: pci@0000:01:00.0 | logical name: eth0 | version: 02 | serial: [...] | size: 10Gbit/s | capacity: 25Gbit/s | width: 64 bits | clock: 33MHz | capabilities: pm msi msix pciexpress vpd bus_master cap_list rom ethernet physical fibre 1000bt-fd 25000bt-fd | configuration: autonegotiation=off broadcast=yes driver=ice driverversion=1.11.14 duplex=full firmware=2.25 0x80007027 1.2934.0 ip=192.168.90.51 latency=0 link=yes multicast=yes port=fibre speed=10Gbit/s | resources: iomemory:400-3ff iomemory:400-3ff irq:16 memory:4002000000-4003ffffff memory:4006010000-400601ffff memory:a1d00000-a1dfffff memory:4005000000-4005ffffff memory:4006220000-400641ffff We set up the /etc/network/interfaces file by invoking Grml's netcardconfig script in automated mode, like: NET_DEV=eth0 METHOD=static IPADDR=192.168.90.51 NETMASK=255.255.255.248 GATEWAY=192.168.90.49 /usr/sbin/netcardconfig The resulting /etc/network/interfaces gets used as base for usage inside the NGCP chroot/target system. netcardconfig shuts down the network interface (eth0 in the example above) via ifdown, then sleeps for 3 seconds and re-enables the interface (via ifup) with the new configuration. This used to work fine so far, but with the Intel e810 network card and kernel version 6.1.0-9-amd64 from Debian/bookworm we see a link failure and it takes ~10 seconds until the network device is up and running again. 
The following vagrant_configuration() execution from deployment.sh then fails: | +11:41:01 (netscript.grml:1022): vagrant_configuration(): wget -O /var/tmp/id_rsa_sipwise.pub http://builder.mgm.sipwise.com/vagrant-ngcp/id_rsa_sipwise.pub | --2023-06-11 11:41:01-- http://builder.mgm.sipwise.com/vagrant-ngcp/id_rsa_sipwise.pub | Resolving builder.mgm.sipwise.com (builder.mgm.sipwise.com)... failed: Name or service not known. | wget: unable to resolve host address 'builder.mgm.sipwise.com' However, when we retry it again just a bit later, the network works fine again. During investigation we identified that the network card flips the port, quoting the related log from the connected Cisco nexus 5020 switch (with fast stp learning mode): | nexus5k %ETHPORT-5-IF_DOWN_LINK_FAILURE: Interface Ethernet1/33 is down (Link failure) It seems to be related to some autonegotiation problem, as when we execute `ethtool -A eth0 rx on tx on` (no matter whether with `on` or `off`), we see: | [Tue Jun 13 08:51:37 2023] ice 0000:01:00.0 eth0: Autoneg did not complete so changing settings may not result in an actual change. 
| [Tue Jun 13 08:51:37 2023] ice 0000:01:00.0 eth0: NIC Link is Down | [Tue Jun 13 08:51:45 2023] ice 0000:01:00.0 eth0: NIC Link is up 10 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: NONE, Autoneg Advertised: On, Autoneg Negotiated: False, Flow Control: Rx/Tx FTR: | root@sp1 ~ # ethtool -A eth0 autoneg off | netlink error: Operation not supported | 76 root@sp1 ~ # ethtool eth0 | grep -C1 Auto-negotiation | Duplex: Full | Auto-negotiation: off | Port: FIBRE | root@sp1 ~ # ethtool -A eth0 autoneg on | root@sp1 ~ # ethtool eth0 | grep -C1 Auto-negotiation | Duplex: Full | Auto-negotiation: off | Port: FIBRE | root@sp1 ~ # dmesg -T | tail -1 | [Tue Jun 13 08:53:26 2023] ice 0000:01:00.0 eth0: To change autoneg please use: ethtool -s <dev> autoneg <on|off> | root@sp1 ~ # ethtool -s eth0 autoneg off | root@sp1 ~ # ethtool -s eth0 autoneg on | netlink error: link settings update failed | netlink error: Operation not supported | 75 root@sp1 ~ # As a workaround, at least until we have a better fix/solution, we try to reach the default gateway (or fall back to the repository host if the gateway couldn't be identified) via ICMP/ping, and once that works we continue as usual. But even if that should fail we continue execution, to minimize behavior change but have a workaround for this specific situation available. FTR, broken system: | root@sp1 ~ # ethtool -i eth0 | driver: ice | version: 6.1.0-9-amd64 | firmware-version: 2.25 0x80007027 1.2934.0 | [...] Whereas with kernel 5.10.0-23-amd64 from Debian/bullseye we don't seem to see that behavior: | root@sp1:~# ethtool -i neth0 | driver: ice | version: 5.10.0-23-amd64 | firmware-version: 2.25 0x80007027 1.2934.0 | [...] 
Also using latest available ice v1.11.14 (from https://sourceforge.net/projects/e1000/files/ice%20stable/1.11.14/) on Kernel version 6.1.0-9-amd64 doesn't bring any change: | root@sp1 ~ # modinfo ice | filename: /lib/modules/6.1.0-9-amd64/updates/drivers/net/ethernet/intel/ice/ice.ko | firmware: intel/ice/ddp/ice.pkg | version: 1.11.14 | license: GPL v2 | description: Intel(R) Ethernet Connection E800 Series Linux Driver | author: Intel Corporation, <linux.nics@intel.com> | srcversion: 818E9C817731C98A25470C0 | alias: pci:v00008086d00001888sv*sd*bc*sc*i* | [...] | alias: pci:v00008086d00001591sv*sd*bc*sc*i* | depends: ptp | retpoline: Y | name: ice | vermagic: 6.1.0-9-amd64 SMP preempt mod_unload modversions | parm: debug:netif level (0=none,...,16=all) (int) | parm: fwlog_level:FW event level to log. All levels <= to the specified value are enabled. Values: 0=none, 1=error, 2=warning, 3=normal, 4=verbose. Invalid values: >=5 | (ushort) | parm: fwlog_events:FW events to log (32-bit mask) | (ulong) | root@sp1 ~ # ethtool -i eth0 | head -3 | driver: ice | version: 1.11.14 | firmware-version: 2.25 0x80007027 1.2934.0 | root@sp1 ~ # Change-Id: Ieafe648be4e06ed0d936611ebaf8ee54266b6f3c |
2 years ago |
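The connectivity check described in 8cfb8c8392 could look roughly like the sketch below. Function names, the retry count, and the `PING_CMD` override are assumptions made for illustration; the commit's actual implementation is not shown on this page.

```shell
# Prints the default gateway as reported by iproute2, or nothing if unknown.
default_gateway() {
  ip route show default 2>/dev/null | awk '/^default via/ { print $3; exit }'
}

# Pings the host once per attempt until it answers or the attempts run out.
wait_for_host() {
  local host="$1"
  local tries="${2:-30}"               # assumed default, not from the commit
  local ping_cmd="${PING_CMD:-ping}"   # overridable, e.g. for testing

  while [ "${tries}" -gt 0 ]; do
    if "${ping_cmd}" -c 1 -W 1 "${host}" >/dev/null 2>&1; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Prefers the default gateway, falls back to the given repository host, and
# never aborts the deployment if the check fails.
check_online() {
  local fallback_host="$1"
  local target
  target="$(default_gateway)"
  [ -n "${target}" ] || target="${fallback_host}"
  wait_for_host "${target}" ||
    echo "Warning: ${target} not reachable, continuing anyway" >&2
  return 0
}
```

Returning success unconditionally from `check_online` mirrors the commit's stated intent to continue execution even if the check fails.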
|
f4da3e094e |
MT#57049 Ensure SW-RAID device is inactive before re-reading partition table
Re-reading of disks fails if the mdadm SW-RAID device is still active: | root@sp1 ~ # cat /proc/mdstat | Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] | md0 : active raid1 sdb3[1] sda3[0] | 468218880 blocks super 1.2 [2/2] [UU] | [========>............] resync = 42.2% (197855168/468218880) finish=22.4min speed=200756K/sec | bitmap: 3/4 pages [12KB], 65536KB chunk | | unused devices: <none> | root@sp1 ~ # blockdev --rereadpt /dev/sdb | blockdev: ioctl error on BLKRRPART: Device or resource busy | 1 root@sp1 ~ # blockdev --rereadpt /dev/sda | blockdev: ioctl error on BLKRRPART: Device or resource busy | 1 root@sp1 ~ # Only if we stop the mdadm SW-RAID device, then we can re-read the partition table: | root@sp1 ~ # mdadm --stop /dev/md0 | mdadm: stopped /dev/md0 | root@sp1 ~ # blockdev --rereadpt /dev/sda | root@sp1 ~ # This behavior isn't new and is unrelated to Debian/bookworm, but was spotted while debugging an unrelated issue. FTR: we re-read the partition table (via `blockdev --rereadpt`) to ensure that /etc/fstab of the live system is up to date and matches the current system state. While this isn't strictly needed, we preserve existing behavior and also try to avoid a hard "cut" of a possibly ongoing SW-RAID sync. Change-Id: I735b00423e6efa932f74b78a38ed023576e5d306 |
2 years ago |
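The ordering described in f4da3e094e (stop the SW-RAID device first, then re-read the partition table) can be illustrated as below. The function names and the optional mdstat path argument are assumptions; the sketch also makes no attempt to handle an ongoing resync gracefully, which the commit message mentions as a concern.

```shell
# Returns success if the md device is listed as active in the mdstat file.
md_is_active() {
  local md_name="${1##*/}"            # /dev/md0 -> md0
  local mdstat="${2:-/proc/mdstat}"   # overridable for testing
  grep -q "^${md_name} : active" "${mdstat}" 2>/dev/null
}

# Stops the SW-RAID device if it is active, then re-reads the partition
# table of every disk passed after the md device.
reread_partition_tables() {
  local md_dev="$1"
  local disk
  shift
  if md_is_active "${md_dev}"; then
    mdadm --stop "${md_dev}"
  fi
  for disk in "$@"; do
    blockdev --rereadpt "${disk}"
  done
}
```

With the devices from the log above, the call would be `reread_partition_tables /dev/md0 /dev/sda /dev/sdb`.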
|
2ad306c465 |
MT#57556 Prompt for reboot/halt only in interactive mode
With our newer Grml-Sipwise ISO (v2023-06-01) being based on Debian/bookworm and recent Grml packages, our automated deployment suddenly started to fail for us: | +04:28:12 (netscript.grml:2453): echo 'Successfully finished deployment process [Fri Jun 2 04:28:12 UTC 2023 - running 576 seconds]' | ++04:28:12 (netscript.grml:2455): get_deploy_status | ++04:28:12 (netscript.grml:95): get_deploy_status(): '[' -r /srv/deployment//status ']' | ++04:28:12 (netscript.grml:96): get_deploy_status(): cat /srv/deployment//status | Successfully finished deployment process [Fri Jun 2 04:28:12 UTC 2023 - running 576 seconds] | +04:28:12 (netscript.grml:2455): '[' copylogfiles '!=' error ']' | +04:28:12 (netscript.grml:2456): set_deploy_status finished | +04:28:12 (netscript.grml:103): set_deploy_status(): '[' -n finished ']' | +04:28:12 (netscript.grml:104): set_deploy_status(): echo finished | +04:28:12 (netscript.grml:2459): false | +04:28:12 (netscript.grml:2463): status_wait | +04:28:12 (netscript.grml:329): status_wait(): [[ -n 0 ]] | +04:28:12 (netscript.grml:329): status_wait(): [[ 0 != 0 ]] | +04:28:12 (netscript.grml:2466): false | +04:28:12 (netscript.grml:2471): false | +04:28:12 (netscript.grml:2476): echo 'Do you want to [r]eboot or [h]alt the system now? (Press any other key to cancel.)' | Do you want to [r]eboot or [h]alt the system now? (Press any other key to cancel.) 
| +04:28:12 (netscript.grml:2477): unset a | +04:28:12 (netscript.grml:2478): read -r a | ++04:28:12 (netscript.grml:2478): wait_exit | ++04:28:12 (netscript.grml:339): wait_exit(): local e_code=1 | ++04:28:12 (netscript.grml:340): wait_exit(): [[ 1 -ne 0 ]] | ++04:28:12 (netscript.grml:341): wait_exit(): set_deploy_status error | ++04:28:12 (netscript.grml:103): set_deploy_status(): '[' -n error ']' | ++04:28:12 (netscript.grml:104): set_deploy_status(): echo error | ++04:28:12 (netscript.grml:343): wait_exit(): trap '' 1 2 3 6 15 ERR EXIT | ++04:28:12 (netscript.grml:344): wait_exit(): status_wait | ++04:28:12 (netscript.grml:329): status_wait(): [[ -n 0 ]] | ++04:28:12 (netscript.grml:329): status_wait(): [[ 0 != 0 ]] | ++04:28:12 (netscript.grml:345): wait_exit(): exit 1 As of grml-autoconfig v0.20.3 and newer, the grml-autoconfig systemd service that invokes the deployment netscript uses `StandardInput=null` instead of `StandardInput=tty` (see https://github.com/grml/grml/issues/176). Thanks to this, a logic error in our deployment script showed up. We exit the script in interactive mode, though only *afterwards* prompting for reboot/halt with `read -r a` - which of course fails if stdin is missing. As a result, we end up in our signal handler `trap 'wait_exit;' 1 2 3 6 15 ERR EXIT` and then fail the deployment. So instead prompt for "Do you want to [r]eboot or [h]alt ..." *only* in interactive mode, and while at it drop the "if "$INTERACTIVE" ; then exit 0 ; fi" so the prompt is actually presented to the user. Change-Id: Ia89beaf3c446f3701cc30ab21cfdff7b5808a6d3 |
2 years ago |
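The control flow that 2ad306c465 describes (prompt only in interactive mode, so `read` is never reached when stdin is missing) can be sketched as follows. The function name is hypothetical, and the sketch prints the chosen action instead of rebooting or halting. The `|| true` guard after `read` is an extra precaution added here, not something the commit message mentions.

```shell
# The first argument holds the command name "true" or "false", matching the
# `if "$INTERACTIVE"` idiom quoted in the commit message.
prompt_reboot_or_halt() {
  local interactive="$1"
  local a=""

  if ! "${interactive}"; then
    return 0   # non-interactive: no prompt, stdin is never read
  fi

  echo "Do you want to [r]eboot or [h]alt the system now? (Press any other key to cancel.)"
  read -r a || true   # tolerate EOF instead of tripping an ERR trap
  case "${a}" in
    r|R) echo "reboot" ;;
    h|H) echo "halt" ;;
    *)   echo "cancel" ;;
  esac
}
```

Under `StandardInput=null` the non-interactive path returns before `read` is executed, so the `ERR`/`EXIT` trap quoted in the commit message is not triggered by the prompt.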
|
98d11bfc28 |
MT#57280 Run deployment status server under systemd
Manual execution of python's http.server has multiple drawbacks, like no proper logging and no service tracking/restart options, but most notably the deployment status server no longer runs when our deployment script fails. While /srv/deployment/status then still might contain "error", no one is serving that information on port 4242 any longer[1], and our daily-build-install-vm Jenkins job might then report: | VM '192.168.209.162' current state is '' - retrying up to another 1646 times, sleeping for a second | VM '192.168.209.162' current state is '' - retrying up to another 1645 times, sleeping for a second | [...] It then runs for ~1/2 hour without doing anything useful, until the Jenkins job itself gives up. By running our deployment status server under systemd, we keep the service alive also when the deployment script terminates. In case of errors we get immediate feedback: | VM '192.168.209.162' current state is 'puppet' - retrying up to another 1648 times, sleeping for a second | VM '192.168.209.162' current state is 'puppet' - retrying up to another 1647 times, sleeping for a second | VM '192.168.209.162' current state is 'error' - retrying up to another 1646 times, sleeping for a second | + '[' error '!=' finished ']' | + echo 'Failed to install Proxom VM '\''162'\'' (IP '\''192.168.209.162'\'')' [1] For our NGCP based installations we use the ngcpstatus boot option, where its status_wait trap kicks in and avoids premature exit of deployment status server. But e.g. our non-NGCP systems don't use that boot option and with this change we could get rid of the status_wait overall. Change-Id: Ibaa799358caedf31c64c37b48e3c5e889808086a |
2 years ago |
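A systemd unit for the status server from 98d11bfc28 might look like the one written by the sketch below. The unit name, the function name, and the choice to serve the whole `/srv/deployment` directory are assumptions; only the use of Python's `http.server`, port 4242, and the status file location come from the commit message.

```shell
# Hypothetical helper: writes an illustrative unit file into the given
# directory (for a live system this could be /run/systemd/system).
write_status_server_unit() {
  local unit_dir="$1"

  cat > "${unit_dir}/deployment-status.service" <<'EOF'
[Unit]
Description=Deployment status server (illustrative sketch)

[Service]
# Serves /srv/deployment, including the status file, on port 4242.
ExecStart=/usr/bin/python3 -m http.server 4242 --directory /srv/deployment
Restart=always

[Install]
WantedBy=multi-user.target
EOF
}
```

After writing the file, `systemctl daemon-reload` followed by `systemctl start deployment-status.service` would start the server independently of the deployment script, so it keeps answering after the script exits.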
|
e6819fe674 |
MT#55944 Use ngcp-initialize-udev-rules-net to deploy 70-persistent-net.rules
Use system-tools' ngcp-initialize-udev-rules-net script to deploy the /etc/udev/rules.d/70-persistent-net.rules, no need to maintain code at multiple places. Change-Id: I81925262a8c687aa9976cbc1113568989fa53281 |
2 years ago |
|
ae7db13232 |
MT#55944 Fix networking for plain Debian systems
When building our Debian boxes for buster, bullseye + bookworm (via daily-build-matrix-debian-boxes Jenkins job), we get broken networking, so e.g. `vagrant up debian-bookworm` doesn't work. This is caused by /etc/network/interfaces (using e.g. "neth0", being our naming schema which we use in NGCP, as adjusted by the deployment script) not matching the actual system network devices (like enp0s3). TL;DR: no behavior change for NGCP systems; when building non-NGCP systems, enable net.ifnames=0 (via set_custom_grub_boot_options), but do *not* generate /etc/udev/rules.d/70-persistent-net.rules (via invoke generate_udev_network_rules) nor rename eth*->neth* in /etc/network/interfaces. More verbose version: * rename the "eth*" networking interfaces into "neth*" in /etc/network/interfaces only when running in ngcp-installer mode (this is the behavior we rely on in NGCP, but it doesn't matter for plain Debian systems) * generate /etc/udev/rules.d/70-persistent-net.rules only when running in ngcp-installer mode. While our jenkins-configs.git's jobs/daily-build/scripts/vm_clean-fs.sh removes the file anyway (for the VM use case), between the initial deployment run and the next reboot the configuration inside the PVE VM still applies, so we end up with an existing /etc/udev/rules.d/70-persistent-net.rules, referring to neth0, while our /etc/network/interfaces configures eth0 instead. * when *not* running in ngcp-installer mode, enable net.ifnames=0 usage in GRUB to disable persistent network interface naming. FTR, this change is *not* needed for NGCP, as on NGCP systems we use /etc/udev/rules.d/70-persistent-net.rules, generated by ngcp-system-tools' ngcp-initialize-udev-rules-net script, also in the VM use case. This is a fixup for a change in git commit |
2 years ago |
|
6412814e6b |
MT#55949 Ensure we have proper date/time configuration
If the date of the running system isn't accurate enough, apt runs might fail with something like: | E: Release file for https://deb/sipwise/com/spce/mr10.5.2/dists/bullseye/InRelease is not valid yet (invalid for another 6h 19min 2s) So let's try to sync date/time of the system via NTP. Given that chrony is a small (only 650 kB disk space) and secure replacement for ntp, let's ship chrony with the Grml deployment ISO (and fall back to ntp usage in deployment script if chrony shouldn't be available). Also, if the system is configured to read the RTC time in the local time zone, this is known as another source of problems, so let's make sure to use the RTC in UTC. Change-Id: I747665d1cee3b6f835c62812157d0203bcfa96e2 |
2 years ago |
|
245c7ef702 |
MT#55861 Update Grml ISO + update to Debian/bookworm
For deploying Debian/bookworm (see MT#55524), we'd like to have an updated Grml ISO. With such a Debian/bookworm based live system, we can still deploy older target systems (like Debian/bullseye). Relevant changes: 1) Add jo as new build-dependency, to generate build information in conf/buildinfo.json (new dependency of grml-live) 2) Always include ca-certificates, as this is required with more recent mmdebstrap versions (>=0.8.0), when using apt repositories with https, otherwise bootstrapping Debian fails. 3) Update to latest stable grml-live version v0.42.0, which: a) added support for "bookworm" as suite name |
2 years ago |
|
ad9e94efb6 |
MT#55861 Load the fake-uname.so pre-loaded library from within the chroot
We build the pre-loaded library targeting a specific Debian release, which might be different (and newer) to the release Grml was built for. This can cause missing versioned symbols (and a loading failure) if the libc in the outer system is older than the inner system. Change-Id: I84f4f307863e534fe0fff85274ae1d5db809012c |
2 years ago |
|
d1d0e61512 |
MT#55379 Use usrmerge for Debian/bookworm based systems
The transition to usrmerge has started in Debian, see https://lists.debian.org/debian-devel-announce/2022/09/msg00001.html Debian/bookworm AKA v12 will only support the merged-/usr layout. Systemd is also dropping support for unmerged-usr systems (see https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html). Deploy the expected filesystem layout accordingly, as in: 1) no-merged-usr for Debian releases up to and including bullseye, and 2) merged-usr starting with bookworm Change-Id: I7b7b294ce12ca245cf978a787bcc20aa9753e73d |
3 years ago |
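The release-based choice from d1d0e61512 can be expressed as a small lookup. The sketch below prints debootstrap's `--merged-usr` / `--no-merged-usr` flags; whether the deployment script passes exactly these flags to its bootstrap tool is an assumption, as are the function name and the list of older releases.

```shell
# Hypothetical helper: prints the filesystem layout option for a release.
usrmerge_option() {
  local release="$1"

  case "${release}" in
    jessie|stretch|buster|bullseye)
      # Releases up to and including bullseye keep the unmerged layout.
      echo "--no-merged-usr"
      ;;
    *)
      # bookworm and newer only support merged-/usr.
      echo "--merged-usr"
      ;;
  esac
}
```

Treating every unknown release name as merged-/usr means future releases get the supported layout without another code change.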
|
b372471a20 |
TT#15305 Fix ngcp-deployment-scripts usage for daily-build-matrix-debian-boxes
Git commit
|
3 years ago |
|
1d4f08b7ed |
TT#15305 development.sh: support trunk-weekly, take two
Change-Id: I83e635dc5916833d0699fd0be5a8a742ef7b40c8 |
3 years ago |
|
6661b04af0 |
TT#15305 deployment.sh: support trunk-weekly
Change-Id: Ie98ac5fa0de848cf54a96039af5532eb8012bab9 |
3 years ago |
|
8e063362ef |
TT#173500 Create tmpfiles with template name
We want to be able to track down any left-behind tmp files, so ensure we're creating them with according file names. Change-Id: I4eb44047f2eb86ba9f0a8aeeb8d6555290f60c00 |
3 years ago |
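Creating temporary files with a recognizable template, as 8e063362ef describes, can be done with `mktemp` and an explicit template. The `ngcp-deployment-` prefix and the function name are assumptions chosen for this sketch.

```shell
# Hypothetical helper: creates a temporary file whose name states its origin
# and purpose, so left-behind files can be traced back to this script.
make_tmpfile() {
  local purpose="$1"
  mktemp "${TMPDIR:-/tmp}/ngcp-deployment-${purpose}.XXXXXX"
}
```

A call such as `make_tmpfile puppet-log` prints the path of the new file, with the trailing `XXXXXX` replaced by random characters.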
|
15aaad8edb |
TT#161150 Replace ngcpsp* with ngcpnodename option
It's needed for support of spN nodes. Sort options in deployment.sh. Remove unused boot options ngcpnonwrecfg and ngcpfillcache. Change-Id: I300e533c15b71d65e768ca2ed4b3a73eb7ec6954 |
3 years ago |
|
be237917d7 |
TT#161150 Refactor options parsing
Merge all options parsing to single point. Move options parsing to the top of the script. Parse boot options first then cmd options if they exist. Simplify some checks. Remove unused options. Change-Id: Ibcb099d9bb2ba26ffed9904c8e5065b392ecb78a |
3 years ago |
|
a99d9ff6e2 |
TT#161150 Refactoring default values and parameter parsing
Sort default values. Rework cmd parameters parsing - remove some reassign, reformat to be more clear, etc. Add some default options CROLE, EADDR, EXTERNAL_NETMASK, ROLE. Change-Id: I287facafeb53dc5390517424935c8a50932246dc |
3 years ago |
|
7b53916c30 |
TT#157450 Add extra logging entries and copy logs later
Add extra deployment statuses for grub-install and try to have more data logged. Change-Id: Id06dfad1264f781157631c51035ab219cfc30070 |
3 years ago |
|
3073c27a40 |
TT#118659 EFI support: ensure to always have a proper FAT filesystem available
If grml-debootstrap detects an existing FAT filesystem on the EFI partition, it doesn't modify/re-create it: | EFI partition /dev/nvme0n1p2 seems to have a FAT filesystem, not modifying. The underlying check is execution of `fsck.vfat -bn $DEVICE`. Now with fsck.fat from dosfstools v4.1-2 as present in Debian/buster we got: | root@grml ~ # fsck.vfat -bn /dev/nvme0n1p2 | fsck.fat 4.1 (2017-01-24) | 0x41: Dirty bit is set. Fs was not properly unmounted and some data may be corrupt. | Automatically removing dirty bit. | There are differences between boot sector and its backup. | This is mostly harmless. Differences: (offset:original/backup) | 0:00/eb, 82:00/46, 83:00/41, 84:00/54, 85:00/33, 86:00/32, 87:00/20 | , 88:00/20, 89:00/20, 510:00/55, 511:00/aa | Not automatically fixing this. | Leaving filesystem unchanged. | 1 root@grml ~ # Now with dosfstools v4.2-1 as present in Debian/bullseye, this might become: | root@grml ~ # fsck.vfat -bn /dev/nvme0n1p2 | fsck.fat 4.2 (2021-01-31) | There are differences between boot sector and its backup. | This is mostly harmless. Differences: (offset:original/backup) | 0:00/eb, 65:01/00, 82:00/46, 83:00/41, 84:00/54, 85:00/33, 86:00/32 | , 87:00/20, 88:00/20, 89:00/20, 510:00/55, 511:00/aa | Not automatically fixing this. In such situations we end up with an incomplete/broken EFI partition, which breaks within our efivarfs post-script: | Mounting /dev/nvme0n1p2 on /boot/efi | mount: /boot/efi: wrong fs type, bad option, bad superblock on /dev/nvme0n1p2, missing codepage or helper program, or other error. | ESC[31;01m-> Failed (rc=1)ESC[0m | ESC[32;01m*ESC[0m Removing chroot-script again | ESC[32;01m*ESC[0m Executing post-script /etc/debootstrap/post-scripts//efivarfs | Executing /etc/debootstrap/post-scripts//efivarfs | Mounting /dev (via bind mount) | Mounting /boot/efi | mount: /boot/efi: special device UUID= does not exist. Change-Id: I46939b4e191982a84792f3aca27c6cc415dbdaf4 |
4 years ago |
|
9ec2c3d459 |
TT#118659 EFI support: provide workaround for grml-debootstrap versions <=0.96
When we run current versions of deployment.sh, which include the fix from commit |
4 years ago |
|
cf01ec9257 |
TT#118659 Ensure that wiping disk signatures works more reliably
Noticed while debugging the EFI situation, that wipefs calls might fail, like: | # wipefs -a /dev/nvme0n1 | wipefs: error: /dev/nvme0n1: probing initialization failed: Device or resource busy Using the force option, we *could* get past this error: | # wipefs -af /dev/nvme0n1 | /dev/nvme0n1: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54 | /dev/nvme0n1: 8 bytes were erased at offset 0x3a38b2de00 (gpt): 45 46 49 20 50 41 52 54 | /dev/nvme0n1: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa But quoting from wipefs(8): | -f, --force | Force erasure, even if the filesystem is mounted. This is required in order to erase a partition-table signature on a block device. So while this would work, there might be unexpected side effects. Instead let's use a different approach: if we remove the LVM signatures *before* running wipefs, it behaves as expected: | root@grml ~ # pvs | PV VG Fmt Attr PSize PFree | /dev/nvme0n1p3 ngcp lvm2 a-- <232.41g <222.41g | root@grml ~ # vgs | VG #PV #LV #SN Attr VSize VFree | ngcp 1 1 0 wz--n- <232.41g <222.41g | root@grml ~ # lvs | LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert | root ngcp -wi-a----- 10.00g | root@grml ~ # wipefs -a /dev/nvme0n1 | wipefs: error: /dev/nvme0n1: probing initialization failed: Device or resource busy | 1 root@grml ~ # vgremove -ff ngcp | Logical volume "root" successfully removed | Volume group "ngcp" successfully removed | root@grml ~ # pvremove /dev/nvme0n1p3 --force --force --yes | Labels on physical volume "/dev/nvme0n1p3" successfully wiped. 
| root@grml ~ # wipefs -a /dev/nvme0n1 | /dev/nvme0n1: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54 | /dev/nvme0n1: 8 bytes were erased at offset 0x3a38b2de00 (gpt): 45 46 49 20 50 41 52 54 | /dev/nvme0n1: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa | /dev/nvme0n1: calling ioctl to re-read partition table: Success FTR, when using wipefs' --force option, it still leaves behind the LVM signatures anyways: | root@grml ~ # pvs | PV VG Fmt Attr PSize PFree | /dev/nvme0n1p3 ngcp lvm2 a-- <232.41g <222.41g | root@grml ~ # vgs | VG #PV #LV #SN Attr VSize VFree | ngcp 1 1 0 wz--n- <232.41g <222.41g | root@grml ~ # lvs | LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert | root ngcp -wi-a----- 10.00g | root@grml ~ # wipefs -a /dev/nvme0n1 | wipefs: error: /dev/nvme0n1: probing initialization failed: Device or resource busy | 1 root@grml ~ # wipefs -af /dev/nvme0n1 | /dev/nvme0n1: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54 | /dev/nvme0n1: 8 bytes were erased at offset 0x3a38b2de00 (gpt): 45 46 49 20 50 41 52 54 | /dev/nvme0n1: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa | root@grml ~ # pvs | PV VG Fmt Attr PSize PFree | /dev/nvme0n1p3 ngcp lvm2 a-- <232.41g <222.41g | root@grml ~ # vgs | VG #PV #LV #SN Attr VSize VFree | ngcp 1 1 0 wz--n- <232.41g <222.41g | root@grml ~ # lvs | LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert | root ngcp -wi-a----- 10.00g So we'd still have to wipe the LVM signatures, while enabling wipefs' --force option could lead to unexpected behaviors. Verified with: | root@grml ~ # wipefs --version | wipefs from util-linux 2.36.1 | root@grml ~ # uname -a | Linux web01a 5.10.0-6-amd64 #1 SMP Debian 5.10.28-1 (2021-04-09) x86_64 GNU/Linux | root@grml ~ # lvm version | head -3 | LVM version: 2.03.11(2) (2021-01-08) | Library version: 1.02.175 (2021-01-08) | Driver version: 4.43.0 Change-Id: Ie4f7b2797d2dcfc27601792d6102a765e4c60c47 |
4 years ago |
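The cleanup order described in commit cf01ec9257 (remove the volume group, wipe the physical volume label, then run wipefs without --force) could be sketched roughly as follows. This is only an illustration, not the code from deployment.sh: the names `run_cmd`, `wipe_disk` and `DRY_RUN` are hypothetical, and the sketch defaults to printing the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch only: run_cmd, wipe_disk and DRY_RUN are hypothetical names,
# not taken from deployment.sh. Defaults to dry-run for safety.

run_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"        # dry-run: only print the command
  else
    "$@"             # really execute it (needs root and real devices)
  fi
}

wipe_disk() {
  local disk="$1" pv="$2" vg="$3"

  # 1. Remove the volume group, including its logical volumes.
  run_cmd vgremove -ff "$vg"
  # 2. Wipe the LVM label from the physical volume.
  run_cmd pvremove "$pv" --force --force --yes
  # 3. Only now wipe the remaining signatures, without --force.
  run_cmd wipefs -a "$disk"
}

# Device and VG names as used in the commit message above.
wipe_disk /dev/nvme0n1 /dev/nvme0n1p3 ngcp
```

Running it with `DRY_RUN=0` would execute the commands for real, which destroys data on the given disk.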
|
f9aea18c19 |
TT#118659 Fixup for efivarfs handling with grml-debootstrap v0.98
This is a followup fixup for commit |
4 years ago |
|
a56c4454a3 |
TT#105151 Do the renaming eth*->neth* outside of the "if $NGCP_INSTALLER" block
Jobs like daily-build-matrix-debian-boxes build plain Debian machines, not NGCP-based ones. At the moment we're generating the udev-rules for network renaming unconditionally, so we have to do it consistently, either both conditionally and not for "plain" systems, or both unconditionally, so network can be brought up by a correct /etc/network/interfaces after the devices are brought up with the new names. There is a good-ish argument for keeping using eth0, as it is more of a default, but we're already deviating from the default for several years and Debian stable releases by having these names and not ones like "ens18" or "enp4s0f2" which is the default in Debian nowadays, at least since buster. So it is probably better to keep it consistent with our other machines and use "neth*" naming for those too. Change-Id: I6b3b49a1769894580df768abb817ae5196e65963 |
4 years ago |
|
eaecf474c2 |
TT#105151 Stop removing just-generated udev-rules for network in VMs
The removed code was enabled when $VAGRANT=true, which happens when passing the "vagrant" parameter to deployment.sh, as done in places like the proxmox-vm-clone job, the base of many of our test machines. VMs do not necessarily have the same hardware configuration, so removing udev-rules for network devices makes sense in principle, especially since from the beginning we were using network devices named "eth*" everywhere, even if in recent years we had to use net.ifnames=0 and udev-rules files on hardware to keep using "eth*" names. However, now with mr9.5 and the move to Debian bullseye we have to start using different names, and we settled on the direct translation to "neth*". So we need a way to assign whatever network devices the machines come with, including VMs, to names "neth*". (If we used the new permanent device names like ens18 or enp3s0f1 we would have to adapt network.yml and files like network interface, and they would differ across all the different machines (HW and VM), so this is not a better or faster solution to the problem.) So, back to the topic of removal of this udev-rules file: in many cases in our test infra, the machines are built "in place" and then rebooted for upgrades or tests, in principle with the same hardware configuration, so there is no need to remove these files. In cases where the underlying (virtualized) hardware changes, e.g. when used as local VirtualBox-based vagrant machines, we will need to adapt the rules for the existing devices. Change-Id: I57e39a2ec6849f3b5bb8f6cf518e2a2923ec19cb |
4 years ago |
|
44750996be |
TT#105151 Rename network interfaces eth*->neth*
Using "eth*" names has been discouraged for many years; we've been finding problems here and there and working around them with the help of udev-rules (/etc/udev/rules.d/70-persistent-net.rules) to map address interfaces according to PCIIDs, using "net.ifnames=0" as Linux kernel boot parameter when booting in GRUB, etc. Finally we found insurmountable problems when moving to Debian bullseye (mr9.5), because when we attempted to rename interfaces on some hardware systems that we use, we got race conditions and clashes with renaming that we could not solve in other ways. We had different alternatives: - Use purely deterministic names, based on PCI paths (for example "enp4s0f1"), MAC address or one of the other alternatives, which would be "definitive"; but given the diversity of hardware and VM installations at customers, the devices in different systems would be named differently, and the fact that such names are easier to mistype or confuse makes this not ideal. - Use names purely based on functionality, like for example "ha0", "ext0" or "int0". The problem in this case is that we would have to find names that satisfy everyone (and there's no time for doing this at this point), that our system types are quite different from each other (e.g. Pro without bonds, Carrier with bonds and many vlans by default; using the same hardware), and that some customers with different installations or needs (e.g. using VMs) also have totally different network configurations -- so any attempt to unify this to make good use of functionality-based names would be very challenging. - Finally, there's the option to use symbolic names similar to traditional names like "eth0", but not exactly the same. Popular names in general, although there's no wide consensus, are names like "net0" and "lan0".
Talking with the groups involved in deploying and maintaining the system, the decision was taken to move to names that are not purely deterministic; there's no time for purely symbolic ones (they also didn't express much interest in them), and they prefer something more traditional that they are already used to. Instead of names like "net0" or "lan0", they prefer the more direct mapping to existing interfaces like "neth0". This is ugly or slightly discomforting to use for some, but since the main users (among us) of these names prefer them, so be it. It has the advantage of a very simple and mechanical translation from the current names, which helps especially at the critical time of upgrading existing systems to the new names. Change-Id: I4a168c7d81e40f609749f77a509d2acb72d3a9d3 |
4 years ago |
|
a50903a30c |
TT#105151 Stop adding "net.ifnames=0" to grub config
This is commit |
4 years ago |
|
d6b5097a86 |
TT#105151 Run installer under "eatmydata", unless disabled by parameter
Run the installer under "eatmydata" to speed up the process. Also add some more information about timing. In some VMs that we install daily ({ce,pro,carrier}-trunk.mgm) we have the following timings: ce-runner, no eatmydata: 162 seconds, 2 mins 42 secs ce-runner, with eatmydata: 142 seconds, 2 mins 22 secs pro-runner, no eatmydata: 246 seconds, 4 mins 06 secs pro-runner, with eatmydata: 217 seconds, 3 mins 37 secs So in these machines, for CE we save about 20 seconds, which is not much in total but it's a saving of about 12%; and in Pro about 30 seconds (and twice, once per machine, so about a minute in total), which is about 12% as well. In Carrier, which is mostly equivalent to Pro in this respect and typically at least 8 machines, it would mean about 4 mins in total. When installing on hardware in previous days, maybe due to the disks being slower, the total installation time was slightly longer: pro-hardware (Lenovo ThinkSystem SR250), with eatmydata: 226 seconds, 3 mins 46 secs Installing without eatmydata was not measured yet on hardware, but given that the time to install is similar to the case of pro-runner, the performance gain is probably similar too. This looks like a relevant saving and the risk of things going wrong is minimal, so enable it by default. Change-Id: I8267fad08ff337c02801fb8fad0433d9b6d9f4c2 |
4 years ago |
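The idea from commit d6b5097a86 (run the installer under eatmydata unless disabled, and report timing) might look roughly like the sketch below. The names `run_installer` and `EATMYDATA` are hypothetical; the actual parameter used by deployment.sh is not shown in the commit message. The sketch also falls back to a plain invocation when the eatmydata binary is not available, which is an assumption rather than documented behaviour.

```shell
#!/usr/bin/env bash
# Sketch only: run_installer and EATMYDATA are hypothetical names.

run_installer() {
  # Wrap the command in eatmydata unless disabled or unavailable.
  if [ "${EATMYDATA:-true}" = "true" ] && command -v eatmydata >/dev/null 2>&1; then
    eatmydata "$@"
  else
    "$@"
  fi
}

# Record wall-clock time around the (placeholder) installer call.
start=$(date +%s)
run_installer echo "installer would run here"
end=$(date +%s)
echo "Installation took $((end - start)) seconds"
```

Setting `EATMYDATA=false` in the environment runs the command directly, which makes it easy to compare timings with and without the wrapper.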
|
ab62171c49 |
TT#105151 Revert "TT#105151 Stop adding "net.ifnames=0" to grub config"
This reverts commit
|
4 years ago |
|
cd50e4934c |
TT#105151 Stop adding "net.ifnames=0" to grub config
Change-Id: I9a2af93c31f7bd4ab93f4e629c3faa2624291be0 |
4 years ago |
|
0c746e0515 |
TT#104381 '-' is a valid character that appears in PCIID sometimes
Change-Id: Id94023afa1df8377f023e69f21601d07b15f2fd4 |
4 years ago |
|
535e6df392 |
TT#118659 Use "efivarfs" instead of "efivars" + mount /sys/firmware/efi/efivars for efibootmgr
Current trunk installations based on bullseye using recent Grml
environments are broken, as EFI environments running with recent kernel
versions (>=5.10) aren't properly detected anymore.
This is caused by the missing efivars kernel module.
CONFIG_EFI_VARS is no longer available since
|
4 years ago |
|
93209fb893 |
TT#122950 Disable building database of manual pages
The "Building database of manual pages ..." of mandb(8) is invoked during Debian package installations, and takes a considerable amount of time[1]. By disabling this, we can speed up our installation process, similar to what we already do with all our build environments. If someone really needs the man-db database (for apropos(1) or whatis(1) usage), then invoking `systemctl restart man-db.service` provides that on demand. FTR: there are also /etc/cron.daily/man-db + /etc/cron.weekly/man-db, though they don't do anything when running under systemd. There's also man-db.timer, though we don't have it enabled by default on our NGCP systems. [1] Demo from a running PRO system: | root@sp2:~# rm -rf /var/cache/man | root@sp2:~# time systemctl restart man-db.service | | real 1m18.357s | user 0m0.000s | sys 0m0.009s Change-Id: If98007860490adc5ad954e8c36000abd7281931b |
4 years ago |
|
c73a063f52 |
TT#118659 Add options to install bullseye
Add options to install bullseye in all places where buster is used, use it as default when possible, and keep these for the moment. Switch to bullseye in Dockerfile. Change-Id: I2f693982ba92a671a6f2254c5a245a1d05231404 |
4 years ago |
|
6e1c841305 |
TT#119602 Hide errexit on VBoxLinuxAdditions.run call
The call:
UTS_RELEASE="${KERNELVERSION}" LD_PRELOAD="${FAKE_UNAME}" \
grml-chroot "${TARGET}" /media/cdrom/VBoxLinuxAdditions.run --nox11
fails with:
Running in chroot, ignoring request: daemon-reload
Before
|
4 years ago |
|
8a54cd1374 |
TT#119602 Properly handle trap also in case of errors in functions
Quoting from "man bash" about `-E` (AKA errtrace): | If set, any trap on ERR is inherited by shell functions, command | substitutions, and commands executed in a subshell environment. | The ERR trap is normally not inherited in such cases. To demonstrate the problem see this short shell script: | % cat foo | set -eu -o pipefail | | bailout() { | echo "Bailing out because of error" >&2 | exit 1 | } | trap bailout 1 2 3 6 9 14 15 ERR | | foo() { | echo "Executing magic" | magic | } | | foo | echo end If "magic" can't be executed, then this fails as follows: | % bash ./foo | Executing magic | ./foo: line 11: magic: command not found But it doesn't invoke the bailout function via trap. When using `set -eE` (AKA errexit + errtrace), instead of only `set -e` (errexit), then it behaves as expected though: | % bash ./foo | Executing magic | ./foo: line 11: magic: command not found | Bailing out because of error Change-Id: I26396b87d4a391a75997c061e866709daa57870e |
4 years ago |
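The difference between `set -e` and `set -eE` described in commit 8a54cd1374 can be reproduced with a small self-contained script. The sketch below runs the same snippet twice in a child bash, once per option set. The helper name `run_demo` is hypothetical, only the ERR trap is installed (the signal numbers from the original example are left out), and `magic` is deliberately a command that does not exist.

```shell
#!/usr/bin/env bash
# Sketch only: run_demo is a hypothetical helper for this demonstration.

run_demo() {
  # $1: shell options to enable in the child shell, e.g. "-eu" or "-eEu"
  bash -c '
    set '"$1"' -o pipefail
    bailout() {
      echo "Bailing out because of error" >&2
      exit 1
    }
    trap bailout ERR
    foo() {
      echo "Executing magic"
      magic   # fails with: command not found
    }
    foo
    echo end
  ' 2>&1
}

echo "--- errexit only (set -e) ---"
run_demo -eu || true

echo "--- errexit + errtrace (set -eE) ---"
run_demo -eEu || true
```

Only the second run is expected to print the "Bailing out because of error" line, because the ERR trap is inherited by the function `foo` only when errtrace is enabled.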
|
91e047a486 |
TT#105407 Ensure lvm2 is present before grub-install is executed
grub-pc >=2.04-11 has a new behavior regarding /boot/grub/i386-pc/ handling, where we end up with an empty /boot/grub/i386-pc/ after *successful* grub-install execution: | root@grml ~ # vgchange -ay | 3 logical volume(s) in volume group "ngcp" now active | root@grml ~ # mount /dev/mapper/ngcp-root /mnt | root@grml ~ # grml-chroot /mnt /bin/bash | Writing /etc/debian_chroot ... | (spce)root@grml:/# cd | (spce)root@grml:~# grub-install /dev/sda | Installing for i386-pc platform. | Installation finished. No error reported. | (spce)root@grml:~# ls -la /boot/grub/i386-pc/ | total 16 | drwxr-xr-x 2 root root 12288 Dec 16 12:04 . | drwxr-xr-x 4 root root 4096 Dec 16 12:07 .. This causes the installed system to fail to boot with: | GRUB loading.. | Welcome to GRUB! | | error: file `/boot/grub/i386-pc/normal.mod' not found. | grub rescue> _ The underlying issue is that recent grub versions unlink the files inside /boot/grub/i386-pc, though grub-install doesn't report anything about it (even under `--verbose` execution). This is triggered in our situation, as lvm2's vgs binary isn't present yet. In earlier versions of grub this wasn't causing any problems and grub-install happily installed the files inside /boot/grub/i386-pc, even though we installed lvm2 only afterwards via our metapackages. To ensure lvm2 is available during installation time within grml-debootstrap, explicitly add it to the list of packages to be installed. See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=977544 for further details regarding the grub bug. Change-Id: I27a1cd18777526eb26b838fae88d4d87b6e93467 |
4 years ago |
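Commit 91e047a486 fixes the problem by adding lvm2 to the package list. As a complementary illustration, the two symptoms it describes (missing `vgs` binary, missing `normal.mod` after a seemingly successful grub-install) could be checked with guards like the following. Both `require_vgs` and `check_grub_modules` are hypothetical helpers and are not part of deployment.sh or grml-debootstrap.

```shell
#!/usr/bin/env bash
# Sketch only: require_vgs and check_grub_modules are hypothetical helpers.

# Fail early if lvm2's vgs binary is not in PATH.
require_vgs() {
  if ! command -v vgs >/dev/null 2>&1; then
    echo "vgs not found, is lvm2 installed?" >&2
    return 1
  fi
}

# Verify that normal.mod exists in the given GRUB module directory,
# since grub-install may report success while leaving it empty.
check_grub_modules() {
  local dir="${1:-/boot/grub/i386-pc}"
  if [ ! -e "${dir}/normal.mod" ]; then
    echo "${dir}/normal.mod is missing, system would not boot" >&2
    return 1
  fi
}

if require_vgs; then
  echo "vgs is available"
else
  echo "install lvm2 before running grub-install"
fi
```

In an installer, `require_vgs` would be called inside the chroot before grub-install and `check_grub_modules` right after it, so that a silently emptied module directory is caught before the first reboot.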
|
6ce51a8c0d |
TT#104221 Ensure to have fake-uname.so available also for plain images
We install virtualbox-guest-additions in the target system for usage
with VirtualBox and shared folders via Vagrant. We invoke the
VBoxLinuxAdditions.run machinery from the running Grml live system. But
the target systems usually has a different kernel package and version
installed, so we have to apply some tricks to get it working. This is
where we rely on fake-uname.so.
Since commit
|
4 years ago |
|
ed52e8fe7a |
TT#104221 Use bullseye repos in ensure_packages_installed appropriately
The ensure_packages_installed function ensures that specified packages are present during runtime. This is used e.g. for installation of virtualbox-guest-additions-iso Debian package from within vagrant_configuration(), which is used to execute /media/cdrom/VBoxLinuxAdditions.run inside the target system. We can't use random Debian repositories though, as the package dependencies need to match the running live system. So far we only used the buster repository, as our current grml-sipwise ISOs are based on something close to buster. On the other hand we can't use virtualbox-guest-additions-iso from Debian/buster in our Debian/bullseye Vagrant boxes, as /sbin/mount.vboxsf doesn't work then. So use the bullseye repository if the release of the target system is bullseye, which seems to work with our current Grml ISOs and current state of bullseye. Change-Id: Iaf965daa6ff7a62e2b3bd8c55b8f761abd94c241 |
4 years ago |
|
3a5149e01c |
TT#100201 Support Debian/bullseye by dropping stretch+buster checks
Nowadays we only deploy stretch + buster based Debian systems, so drop those release specific checks to also support bullseye and newer Debian releases. Change-Id: Ibf3d1527ccaeba60526a730e6886e6521c08d20e |
5 years ago |
|
862fb155f0 |
TT#83753 Port status server to py3
The /usr/bin/python symlink/binary no longer exists in recent Grml-Sipwise ISOs and python3 doesn't ship SimpleHTTPServer but http.server instead. Change-Id: I6677e8a416b142034d99d5b1d2b11ba74d87a6ec |
5 years ago |