In LP: #2009141, we are hitting kernel limits and pyudev buffer limits.
We don't care about specific events, so much as getting one event,
waiting for things to calm down, then reprobing.
Outright disable the event monitor, and re-enable later. If there is a
storm of events, testing has shown that stopping the listener is not
enough.
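
A minimal sketch of the "one event, calm down, reprobe" idea described above. The FakeMonitor and DebouncedProber names, and the calm_down delay, are hypothetical and only stand in for the real udev monitor handling; this is an illustration, not the actual change.

```python
import asyncio


class FakeMonitor:
    """Hypothetical stand-in for a udev event monitor."""

    def __init__(self):
        self.enabled = False

    def enable(self):
        self.enabled = True

    def disable(self):
        self.enabled = False


class DebouncedProber:
    """React to the first event, then ignore the rest of the storm."""

    def __init__(self, monitor, reprobe, calm_down=0.05):
        self.monitor = monitor
        self.reprobe = reprobe      # coroutine function doing the reprobe
        self.calm_down = calm_down  # assumed delay, in seconds
        self._task = None

    def start(self):
        self.monitor.enable()

    def on_event(self):
        # While the monitor is disabled, events are dropped entirely.
        if not self.monitor.enabled:
            return
        # Disable the monitor outright rather than only pausing a listener.
        self.monitor.disable()
        self._task = asyncio.get_running_loop().create_task(self._settle())

    async def _settle(self):
        # Wait for things to calm down, reprobe once, then listen again.
        await asyncio.sleep(self.calm_down)
        await self.reprobe()
        self.monitor.enable()
```

With this shape, a burst of events results in a single reprobe, and the monitor is only re-enabled once that reprobe has finished.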
(cherry picked from commit b11726d398)
Before using fs_controller.is_core_boot_classic(), we wait for the call
to /meta/confirmation?tty=xxx. That said, in semi-automated desktop
installs, sometimes the call to /meta/confirmation happens before
marking storage configured. This leads to the following error:
    File "subiquity/server/controllers/oem.py", line 209, in apply_autoinstall_config
      await self.load_metapkgs_task
    File "subiquity/server/controllers/oem.py", line 81, in list_and_mark_configured
      await self.load_metapackages_list()
    File "subiquitycore/context.py", line 149, in decorated_async
      return await meth(self, **kw)
    File "subiquity/server/controllers/oem.py", line 136, in load_metapackages_list
      if fs_controller.is_core_boot_classic():
    File "subiquity/server/controllers/filesystem.py", line 284, in is_core_boot_classic
      return self._info.is_core_boot_classic()
  AttributeError: 'NoneType' object has no attribute 'is_core_boot_classic'
Receiving the confirmation before getting the storage configured is
arguably wrong - but let's be prepared for it just in case.
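
One way to be prepared for this ordering is to wait until storage is marked configured before asking the question. The sketch below assumes that approach; FakeInfo, FakeFilesystemController, the "configured" event and the metapackage name are all hypothetical and are not taken from the Subiquity code.

```python
import asyncio
from typing import Optional


class FakeInfo:
    """Hypothetical stand-in for the object stored in _info."""

    def is_core_boot_classic(self) -> bool:
        return False


class FakeFilesystemController:
    def __init__(self):
        self._info: Optional[FakeInfo] = None
        # Assumed event, set once storage has been marked configured.
        self.configured = asyncio.Event()

    def mark_configured(self, info: FakeInfo) -> None:
        self._info = info
        self.configured.set()

    def is_core_boot_classic(self) -> bool:
        # Raises AttributeError if called while _info is still None.
        return self._info.is_core_boot_classic()


async def load_metapackages_list(fs_controller):
    # Do not rely on the confirmation call alone: also wait for storage.
    await fs_controller.configured.wait()
    if fs_controller.is_core_boot_classic():
        return []
    return ["oem-example-meta"]  # illustrative package name
```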
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
(cherry picked from commit 59849f7f45)
When v2/orig_config is called too early, the load_probe_data function
will fail because probe_data is None:
  Traceback (most recent call last):
    File "subiquity/common/api/server.py", line 164, in handler
      result = await implementation(**args)
    File "subiquity/server/controllers/filesystem.py", line 1029, in v2_orig_config_GET
      model = self.model.get_orig_model()
    File "subiquity/models/filesystem.py", line 1428, in get_orig_model
      orig_model.load_probe_data(self._probe_data)
    File "subiquity/models/filesystem.py", line 1894, in load_probe_data
      for devname, devdata in probe_data["blockdev"].items():
  TypeError: 'NoneType' object is not subscriptable
Make sure we don't dereference model._probe_data if it is None.
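
A minimal sketch of such a guard, assuming that an empty model is an acceptable answer when no probe data is available yet. FilesystemModelSketch is a hypothetical, heavily reduced stand-in for the real model.

```python
from typing import List, Optional


class FilesystemModelSketch:
    def __init__(self):
        self._probe_data: Optional[dict] = None
        self.devices: List[str] = []

    def load_probe_data(self, probe_data: dict) -> None:
        # Would raise TypeError if probe_data were None.
        for devname, devdata in probe_data["blockdev"].items():
            self.devices.append(devname)

    def get_orig_model(self) -> "FilesystemModelSketch":
        orig_model = FilesystemModelSketch()
        # Only load probe data once probing has actually produced some.
        if self._probe_data is not None:
            orig_model.load_probe_data(self._probe_data)
        return orig_model
```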
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
(cherry picked from commit 7de6f0538b)
We recently made sure that after doing a snap refresh, the rich mode
(i.e., either rich or basic) is preserved. This was implemented by
storing the rich mode in a state file. When the client starts, it loads
the rich mode from said state file if it exists.
Unfortunately, on s390x, it causes installs to default to basic mode.
This happens because on this architecture, a subiquity install consists
of:
* a first client (over serial) showing the SSH password
* a second client (logging over SSH) actually going through the
installation UI.
Since the first client uses a serial connection, the state file is
created with rich-mode set to basic. Upon connecting using SSH, the
state file is read and the rich-mode is set to basic as well.
Fixed by storing the rich-mode in two separate files, one for clients
over serial and one for other clients.
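
The separation can be sketched as follows. The file names and helper functions are illustrative assumptions, not the names used by Subiquity.

```python
from pathlib import Path
from typing import Optional


def rich_mode_state_path(state_dir: Path, is_serial: bool) -> Path:
    # Illustrative file names: one state file per kind of client.
    name = "rich-mode-serial" if is_serial else "rich-mode-tty"
    return state_dir / name


def save_rich_mode(state_dir: Path, is_serial: bool, rich: bool) -> None:
    path = rich_mode_state_path(state_dir, is_serial)
    path.write_text("rich" if rich else "basic")


def load_rich_mode(state_dir: Path, is_serial: bool) -> Optional[bool]:
    """Return the stored mode, or None if this kind of client has no state."""
    try:
        content = rich_mode_state_path(state_dir, is_serial).read_text()
    except FileNotFoundError:
        return None
    return content.strip() == "rich"
```

A serial client storing "basic" then no longer affects what a client connecting over SSH loads.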
LP: #2036096
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
(cherry picked from commit c95261e0de)
These changes are not supposed to take nearly this long, but per
LP: #2034715 we know that they do, and that some systems will
correctly perform the finish_install() step if just given more time.
(cherry picked from commit 5a573f2cef)
This curtin rev adds the following:
Dan Bungert (3):
extract: log source information
tests/data: 4k sector disk
storage_config: handle partitions on 4k disk
Nick Rosbrook (1):
apt: disable default deb822 migration
(cherry picked from commit ea7b683d8e)
For ZFS, we recently introduced a call to $(umount --recursive /target)
slightly before shutting down or rebooting. Unfortunately, on s390x, we
also had a very late call to chreipl to make the firmware boot from the
installed system.
The call to chreipl reads data from /target/boot, and it fails if the
filesystem is no longer mounted.
Fixed by calling chreipl earlier in the installation, during the
postinst phase rather than after the user clicks "reboot".
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
(cherry picked from commit cf828eeb8d)
Making an install that used an existing RAID failed because of an
attempt to log the size of the RAID when rendering the curtin config.
This turns out to be because when the client sends the storage objects
back to the server it loses all the "api only" data including the udev
data that is needed to display the size.
In some sense this is a bit silly: we could just drop the log statement
and it would be fine. But I think it's probably better to always have
the full storage objects in the server (until we can get away from this
hackish API anyway).
(cherry picked from commit 4d24865a63)
Adding this import means a dependency on probert, which also means
anybody importing subiquity.common.types also has that requirement.
The make-kbd-info script imports types, and that step was causing
snapcraft build failures due to not finding probert.
When the URL of the security archive is unset, curtin will set it to the
URL of the primary archive.
This is not the behavior we want for Ubuntu installations. On amd64 (and
i386), the URL of the security archive should be set to
http://security.ubuntu.com/ubuntu
On other architectures, it should be set to
http://ports.ubuntu.com/ubuntu-ports
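
The rule above can be expressed as a small function. The function name is hypothetical; the URLs and the architecture split come from the description above.

```python
def security_archive_url(arch: str) -> str:
    """Pick the security archive URL for a given Debian architecture name."""
    if arch in ("amd64", "i386"):
        return "http://security.ubuntu.com/ubuntu"
    # Every other architecture is served from the ports archive.
    return "http://ports.ubuntu.com/ubuntu-ports"
```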
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
Mirror testing should focus on testing the primary mirror, not the
security archive - therefore we disable the -security suite.
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
When a network interface is disconnected from the system (e.g.,
physically removed if it's a USB adapter), probert asynchronously calls
the del_link() method.
Upon receiving this notification, Subiquity server wants to send an
update to the Subiquity clients. The update contains information about
the interface that disappeared - which is obtained through a call to
netdev_info.
Unfortunately, for Wi-Fi and Ethernet interfaces, netdev_info
dereferences the NetworkDev.info variable. Interfaces that no longer
exist on the system (and also interfaces that do not yet exist) have
their "info" variable set to None, so an exception is raised when
dereferencing it.
Wi-Fi interface:
    File "subiquitycore/models/network.py", line 227, in netdev_info
      scan_state=self.info.wlan['scan_state'],
  AttributeError: 'NoneType' object has no attribute 'wlan'
Ethernet interface:
    File "subiquitycore/models/network.py", line 201, in netdev_info
      is_connected = bool(self.info.is_connected)
  AttributeError: 'NoneType' object has no attribute 'is_connected'
Fixed by making sure netdev_info does not raise if the dev.info variable
is None. This is a valid use-case.
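
A minimal sketch of a netdev_info that tolerates a missing link. LinkSketch and NetworkDevSketch are hypothetical reductions of the real classes, and the fallback values (not connected, no scan state) are assumptions rather than what the actual fix reports.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkSketch:
    """Hypothetical stand-in for the probert link object."""
    is_connected: bool = False
    wlan: Optional[dict] = None


@dataclass
class NetworkDevSketch:
    name: str
    type: str  # "eth" or "wlan" in this sketch
    info: Optional[LinkSketch] = None

    def netdev_info(self) -> dict:
        result = {"name": self.name, "type": self.type}
        if self.type == "eth":
            # Assumption: a missing link is reported as not connected.
            result["is_connected"] = (
                bool(self.info.is_connected) if self.info is not None else False
            )
        elif self.type == "wlan":
            # Assumption: a missing link has no scan state to report.
            wlan = self.info.wlan if self.info is not None else None
            result["scan_state"] = wlan["scan_state"] if wlan else None
        return result
```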
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
When accessing the Help menu, Subiquity looks up the IP addresses
currently configured - so it knows whether to show the "Help on SSH
access" option.
Unfortunately, it also looks for IP addresses on devices that were
"configured" through the network screen but that still do not exist in
the system. When such a device exists (e.g., a bond), the Subiquity
client crashes with the following exception:
  Traceback (most recent call last):
    File "subiquity/common/api/server.py", line 164, in handler
      result = await implementation(**args)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "subiquity/server/server.py", line 117, in ssh_info_GET
      ips.extend(map(str, dev.actual_global_ip_addresses))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "subiquitycore/models/network.py", line 394, in actual_global_ip_addresses
      for _, addr in sorted(self.info.addresses.items())
                            ^^^^^^^^^^^^^^^^^^^
  AttributeError: 'NoneType' object has no attribute 'addresses'
A similar crash is observed when calling /network/global_addresses after
creating the bond.
Fixed by only checking the IP addresses of devices that have a
probert.network.Link instance (i.e., they exist in the system).
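
The filtering can be sketched as below. The classes are hypothetical reductions of the real ones; only the "skip devices whose info is None" check reflects the fix described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LinkSketch:
    """Hypothetical stand-in for probert.network.Link."""
    addresses: Dict[str, str] = field(default_factory=dict)


@dataclass
class NetworkDevSketch:
    name: str
    info: Optional[LinkSketch] = None

    @property
    def actual_global_ip_addresses(self) -> List[str]:
        # Dereferences info, so it must not be called when info is None.
        return [addr for _, addr in sorted(self.info.addresses.items())]


def global_ip_addresses(devices) -> List[str]:
    ips: List[str] = []
    for dev in devices:
        # Devices that were only configured (e.g., a new bond) have no link.
        if dev.info is None:
            continue
        ips.extend(map(str, dev.actual_global_ip_addresses))
    return ips
```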
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
When a Wi-Fi interface is present in the machine configuration (e.g.,
mwhudson.json), the GUI seemingly ignores it. This happens because there
is a filter on the server side which only returns Wi-Fi interfaces if
the wlan_support_install_state() function returns
PackageInstallState.DONE.
However, calling the /network endpoint shows that the state is set to
the wrong value:
{"wlan_support_install_state": "NOT_NEEDED"}
This turns out to be inconsistent because:
* we lean on a PackageInstaller instance to tell if wpasupplicant is
installed (this is what the wlan_support_install_state() function
reflects); but
* in dry-run mode, we pretend to install wpasupplicant without
actually relying on the PackageInstaller instance.
Fixed by using the PackageInstaller instance to install the
wpasupplicant package - with a special implementation that only pretends
to install it. This is enough to make the PackageInstaller instance
think the package is installed.
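
A rough sketch of the idea, assuming a single installer object that tracks state in both modes. PackageInstallerSketch is hypothetical, and the enum below only lists the states needed for the illustration (INSTALLING is an assumption; NOT_NEEDED and DONE are the values mentioned above).

```python
import asyncio
import enum


class PackageInstallState(enum.Enum):
    NOT_NEEDED = enum.auto()
    INSTALLING = enum.auto()  # assumed intermediate state
    DONE = enum.auto()


class PackageInstallerSketch:
    def __init__(self, dry_run: bool):
        self.dry_run = dry_run
        self.states = {}

    def state_for_pkg(self, name: str) -> PackageInstallState:
        return self.states.get(name, PackageInstallState.NOT_NEEDED)

    async def install_pkg(self, name: str) -> PackageInstallState:
        self.states[name] = PackageInstallState.INSTALLING
        if self.dry_run:
            # Only pretend to install, but still go through the installer
            # so that its state reflects the "installation".
            await asyncio.sleep(0)
        else:
            raise NotImplementedError("real installation is out of scope here")
        self.states[name] = PackageInstallState.DONE
        return self.states[name]
```

Because the pretend installation goes through the same object, a later state query for wpasupplicant sees DONE instead of NOT_NEEDED.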
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
When the server raises an exception in an HTTP request handler context,
more often than not, the exception is sent back to the client in the
body.
Additionally, the message of the exception (if any) is also copied as
is into an x-error-msg HTTP header.
That said, HTTP headers must obey strict rules. The "\r\n" sequence
indicates the end of the current HTTP header. When using aiohttp, the
library rejects any header that has a "\r" or "\n" in its value:
  ValueError: Newline or carriage return character detected in HTTP status message or header. This is a potential security issue.
As an example, any curtin.util.ProcessExecutionError exception will
contain "\n" characters when converted into a string.
We now encode the error message as JSON before copying it in the HTTP
header.
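
A minimal sketch of the encoding step, using only the standard json module. The helper names are hypothetical; the point is that json.dumps escapes "\r" and "\n", so the resulting value is a single line that the receiving side can decode back.

```python
import json


def encode_error_header(exc: BaseException) -> str:
    # json.dumps turns newlines into the two-character sequence "\n",
    # so the header value no longer contains raw CR or LF characters.
    return json.dumps(str(exc))


def decode_error_header(value: str) -> str:
    # The receiving side reverses the encoding to get the original message.
    return json.loads(value)
```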
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>