I managed to get the machine to boot with nvidia-driver-470:
Code:
NVIDIA-SMI 470.129.06 Driver Version: 470.129.06 CUDA Version: 11.4
Code:
paul@clio:~$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:1c.7/0000:09:00.0/0000:0a:07.0/0000:0e:00.0 ==
modalias : pci:v000014E4d000043A0sv00001043sd00008659bc02sc80i00
vendor : Broadcom Inc. and subsidiaries
model : BCM4360 802.11ac Wireless Network Adapter
driver : bcmwl-kernel-source - distro non-free
== /sys/devices/pci0000:00/0000:00:03.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001B06sv00003842sd00006696bc03sc00i00
vendor : NVIDIA Corporation
model : GP102 [GeForce GTX 1080 Ti]
manual_install: True
driver : nvidia-driver-515 - distro non-free recommended
driver : nvidia-driver-510 - distro non-free
driver : nvidia-driver-510-server - distro non-free
driver : nvidia-driver-515-server - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
== /sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0 ==
modalias : pci:v000010DEd0000128Bsv00001462sd00008C93bc03sc00i00
vendor : NVIDIA Corporation
model : GK208B [GeForce GT 710]
driver : nvidia-driver-418-server - distro non-free
driver : nvidia-driver-470 - distro non-free
driver : nvidia-driver-450-server - distro non-free
driver : nvidia-driver-390 - distro non-free
driver : nvidia-driver-470-server - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
It is showing as manually installed, which I imagine means that it won't be automatically updated. That is not a problem for now.
I really want to make sure /boot is now clean and consistent. The following commands show the current state:
Code:
paul@clio:~$ uname -r
5.4.0-121-generic
Code:
paul@clio:~$ dpkg -l | tail -n +6 | grep -E 'linux-image-[0-9]+'
rc linux-image-5.4.0-107-generic 5.4.0-107.121 amd64 Signed kernel image generic
ii linux-image-5.4.0-110-generic 5.4.0-110.124 amd64 Signed kernel image generic
ii linux-image-5.4.0-121-generic 5.4.0-121.137 amd64 Signed kernel image generic
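The "rc" in the first column means the 5.4.0-107 package has been removed but its configuration files are still on disk. A minimal sketch for picking out such entries, assuming the usual `dpkg -l` column layout (the helper name `list_rc_packages` is my own, not a standard tool):

```shell
# Hypothetical helper: reads `dpkg -l` output on stdin and prints the
# names of packages whose status column is exactly "rc"
# (removed, configuration files remaining).
list_rc_packages() {
    awk '$1 == "rc" { print $2 }'
}
```

It could be fed with `dpkg -l 'linux-image-*' | list_rc_packages`, and each name it prints could then be passed to `sudo apt-get purge` if the leftover configuration is not wanted.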
Code:
paul@clio:~$ ls -lh /boot
total 368M
-rw-r--r-- 1 root root 233K Apr 14 13:19 config-5.4.0-110-generic
-rw-r--r-- 1 root root 233K Jun 15 14:13 config-5.4.0-121-generic
drwx------ 3 root root 4.0K Jan 1 1970 efi
drwxr-xr-x 4 root root 4.0K Jul 9 03:51 grub
lrwxrwxrwx 1 root root 28 Jul 9 03:51 initrd.img -> initrd.img-5.4.0-121-generic
-rw------- 1 root root 55M Jul 9 01:02 initrd.img-5.4.0-110-generic
-rw------- 1 root root 56M Jul 10 02:54 initrd.img-5.4.0-121-generic
-rw------- 1 root root 111M Aug 25 2021 initrd.img-5.4.0-77-generic
-rw------- 1 root root 111M Sep 8 2021 initrd.img-5.4.0-81-generic
lrwxrwxrwx 1 root root 28 Jul 9 01:05 initrd.img.old -> initrd.img-5.4.0-110-generic
drwx------ 2 root root 16K Jan 22 2021 lost+found
-rw-r--r-- 1 root root 179K Aug 18 2020 memtest86+.bin
-rw-r--r-- 1 root root 181K Aug 18 2020 memtest86+.elf
-rw-r--r-- 1 root root 181K Aug 18 2020 memtest86+_multiboot.bin
-rw------- 1 root root 4.6M Apr 14 13:19 System.map-5.4.0-110-generic
-rw------- 1 root root 4.6M Jun 15 14:13 System.map-5.4.0-121-generic
lrwxrwxrwx 1 root root 25 Jul 9 01:01 vmlinuz -> vmlinuz-5.4.0-121-generic
-rw------- 1 root root 14M Apr 14 13:56 vmlinuz-5.4.0-110-generic
-rw------- 1 root root 14M Jun 15 14:18 vmlinuz-5.4.0-121-generic
lrwxrwxrwx 1 root root 25 Jul 9 01:05 vmlinuz.old -> vmlinuz-5.4.0-110-generic
There seem to be some leftover initrd images taking up a lot of space (111M each):
initrd.img-5.4.0-77-generic
initrd.img-5.4.0-81-generic
Can I just delete these files? Are there any other places where bits of these old kernels might be lying around? I don't know how the system got into this state, but it has persisted for a long time because every attempt to fix it failed. Finally, thanks to help on this forum, things seem to be getting better.
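
As a first check, here is a minimal sketch that lists initrd images in a directory with no matching kernel image. It assumes the usual initrd.img-VERSION / vmlinuz-VERSION naming seen in the listing above, and the function name `find_orphan_initrds` is my own, not a standard tool:

```shell
# Hypothetical helper: print every initrd.img-VERSION file in the given
# directory (default /boot) that has no vmlinuz-VERSION next to it.
find_orphan_initrds() {
    boot_dir="${1:-/boot}"
    for initrd in "$boot_dir"/initrd.img-*; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$initrd" ] || continue
        # Strip everything up to and including "initrd.img-" to get the version.
        version="${initrd##*/initrd.img-}"
        # Report the initrd only if its kernel image is missing.
        [ -e "$boot_dir/vmlinuz-$version" ] || printf '%s\n' "$initrd"
    done
}
```

Run as `find_orphan_initrds /boot` against the listing above, it should report only the 5.4.0-77 and 5.4.0-81 images. It only reads the directory and deletes nothing. Other places worth checking for the same version strings are /lib/modules and /usr/src, where kernel modules and headers normally live.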