Ansible – updating Proxmox host kernel with LXC shared GPU

This playbook automates the cleanup after a kernel update on the Proxmox host, which breaks the LXC's link to the GPU.

Done by hand, the fix is just to reinstall the graphics driver and reboot.

Run it after you have done an update / upgrade of your Proxmox host.
You will have to change the IP addresses for your setup.

In my setup:
10.77.69.2 – Proxmox Host
10.77.69.103 – LXC Plex
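
The playbook below targets two inventory groups, nvidia and plex. A minimal sketch of a matching inventory might look like this, assuming a YAML inventory file (the file name inventory.yml is hypothetical). Note that the second play looks up hostvars['10.77.69.2'], so the Proxmox host has to be listed under exactly that name.

```yaml
# inventory.yml (hypothetical file name; adjust the IPs for your setup)
all:
  children:
    nvidia:
      hosts:
        10.77.69.2:    # Proxmox host
    plex:
      hosts:
        10.77.69.103:  # LXC running Plex
```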

########
- hosts: nvidia
  become: true
  become_user: root
  tasks:
    - name: Wait for 10.77.69.2 to become available
      wait_for_connection:
        delay: 5
        timeout: 300

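    - name: Check if NVIDIA kernel module is loaded
      shell: lsmod | grep -q '^nvidia'
      register: nvidia_module_check
      # A non-zero rc only means the module is missing, so never fail here
      failed_when: false
      # This is a read-only check, so never report it as a change
      changed_when: false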

    - name: Set NVIDIA module check result as fact
      set_fact:
        nvidia_module_rc: "{{ nvidia_module_check.rc }}"

    - name: Reinstall NVIDIA driver if module is not loaded
      shell: sh /root/NVIDIA-Linux-x86_64-535.154.05.run --silent
      args:
        executable: /bin/bash
      when: nvidia_module_check.rc != 0

    - name: Set fact if NVIDIA driver was installed
      set_fact:
        driver_installed: true
      when: nvidia_module_check.rc != 0

    - name: Reboot system if NVIDIA driver was reinstalled
      reboot:
      when: nvidia_module_check.rc != 0

    - name: Wait for 10.77.69.2 to become available after reboot
      wait_for_connection:
        delay: 10
        timeout: 600
      when: nvidia_module_check.rc != 0

########
- hosts: plex
  become: true
  become_user: root
  tasks:
    - name: Install NVIDIA driver in LXC
      shell: sh /root/NVIDIA-Linux-x86_64-535.154.05.run --no-kernel-module --silent
      args:
        executable: /bin/bash
      when: hostvars['10.77.69.2'].driver_installed | default(false)

    - name: Reboot 10.77.69.103
      reboot:
      when: hostvars['10.77.69.2'].driver_installed | default(false)

    - name: Wait for 10.77.69.103 to become available
      wait_for_connection:
        delay: 10
        timeout: 300
      when: hostvars['10.77.69.2'].driver_installed | default(false)


This checks whether the NVIDIA kernel module (NVIDIA is my GPU) is loaded on the host. If it is not, the driver is reinstalled in silent mode and the host is rebooted. It also sets a fact in Ansible so that the GPU driver is reinstalled in the LXC as well, but only when the host driver had to be reinstalled.
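
If you want to confirm that the LXC can see the GPU again afterwards, one option is an extra play that runs nvidia-smi inside the container. This is only a sketch and not part of my playbook above; it assumes nvidia-smi was installed along with the driver and is on the PATH in the LXC.

```yaml
- hosts: plex
  become: true
  become_user: root
  tasks:
    # Assumption: nvidia-smi exits non-zero when it cannot talk to the driver,
    # which makes this task fail and flags the broken GPU link.
    - name: Confirm the GPU is visible inside the LXC
      command: nvidia-smi
      register: nvidia_smi
      changed_when: false

    - name: Show nvidia-smi output
      debug:
        var: nvidia_smi.stdout_lines
```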