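1. GPU Passthrough
Configuring GPU passthrough
- Configuration steps on the host:
0. Prerequisites: enable the IOMMU on the host and install the vGPU driver
## Enable the VT-d IOMMU feature and SR-IOV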
# vi /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto spectre_v2=retpoline rd.lvm.lv=vg00/root rd.lvm.lv=vg00/swap rhgb quiet intel_iommu=on iommu=pt pci=realloc"
GRUB_DISABLE_RECOVERY="true"
## On an AMD CPU, use amd_iommu=on in place of intel_iommu=on
## Regenerate the GRUB configuration and reboot for the change to take effect
# grub2-mkconfig -o /boot/grub2/grub.cfg
# reboot
## If the system boots via UEFI, run the following commands instead (this step cost me half a day)
# grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
# reboot
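The choice between the two grub.cfg paths above can be made automatically. A minimal sketch, assuming a CentOS 7 host as in these notes; `grub_cfg_path` is a hypothetical helper name, and its optional argument exists only so the function can be tried against a fake directory tree:

```shell
#!/bin/sh
# Print the grub.cfg path that grub2-mkconfig should write to.
# /sys/firmware/efi exists only when the kernel was booted via UEFI.
# $1 is an optional filesystem root (default: the real root).
grub_cfg_path() {
    root="${1:-}"
    if [ -d "$root/sys/firmware/efi" ]; then
        echo /boot/efi/EFI/centos/grub.cfg
    else
        echo /boot/grub2/grub.cfg
    fi
}

# Usage (as root):
#   grub2-mkconfig -o "$(grub_cfg_path)" && reboot
```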
## Verify that the kernel parameters took effect
# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-1160.59.1.el7.x86_64 root=/dev/mapper/vg00-root ro crashkernel=auto
spectre_v2=retpoline rd.lvm.lv=vg00/root rd.lvm.lv=vg00/swap rhgb quiet intel_iommu=on iommu=pt pci=realloc
## Note: enabling the IOMMU and SR-IOV requires a reboot! Also confirm that VT and SR-IOV are enabled in the server BIOS
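Beyond reading /proc/cmdline by eye, the check can be scripted. A rough sketch: `cmdline_has_iommu` and `iommu_group_count` are hypothetical helper names, and the group count relies on the kernel populating /sys/kernel/iommu_groups once the IOMMU is actually active:

```shell
#!/bin/sh
# Return 0 if the given kernel command line enables the IOMMU
# (intel_iommu=on for Intel, amd_iommu=on for AMD, as in the notes above).
cmdline_has_iommu() {
    case " $1 " in
        *" intel_iommu=on "*|*" amd_iommu=on "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Count the IOMMU groups the kernel created. A count of 0 usually means
# the IOMMU is still disabled in the BIOS or on the command line.
# $1 is an optional sysfs root (default: /sys).
iommu_group_count() {
    dir="${1:-/sys}/kernel/iommu_groups"
    [ -d "$dir" ] || { echo 0; return; }
    ls "$dir" | wc -l | tr -d ' '
}

# Usage:
#   cmdline_has_iommu "$(cat /proc/cmdline)" && echo "IOMMU parameter present"
#   echo "IOMMU groups: $(iommu_group_count)"
```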
## Install NVIDIA-Linux-x86_64-510.47.03-vgpu-kvm.run
# lsmod |grep nouveau # confirm that nouveau is not loaded (no output expected)
# unzip NVIDIA-GRID-Linux-KVM-510.47.03-511.65.zip
# chmod +x NVIDIA-Linux-x86_64-510.47.03-vgpu-kvm.run
# yum -y install gcc make kernel-devel elfutils-libelf-devel
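# yum -y install kernel kernel-devel kernel-headers
# bash NVIDIA-Linux-x86_64-510.47.03-vgpu-kvm.run # accept the default at every prompt to finish the installation
## If the installer reports that kernel-devel is missing, install kernel-devel first and make sure its version matches kernel and kernel-headers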
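The version match can be checked before launching the installer. A small sketch, assuming an RPM-based host such as the CentOS 7 system in these notes; `check_kernel_pkgs` is a hypothetical helper name:

```shell
#!/bin/sh
# Return 0 if kernel-devel and kernel-headers are installed in exactly
# the version of the running kernel, otherwise report the first gap.
check_kernel_pkgs() {
    kver="$(uname -r)"
    for pkg in kernel-devel kernel-headers; do
        # rpm -q accepts name-version-release.arch, which is what
        # "uname -r" prints after the package name.
        if ! rpm -q "$pkg-$kver" >/dev/null 2>&1; then
            echo "missing: $pkg-$kver"
            return 1
        fi
    done
    echo "ok: build packages match $kver"
}

# Usage:
#   check_kernel_pkgs || yum -y install "kernel-devel-$(uname -r)" "kernel-headers-$(uname -r)"
```

If the running kernel is older than the newest installed one, rebooting into the new kernel is usually simpler than hunting for old kernel-devel packages.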
# lspci -DD|grep NVIDIA
0000:af:00.0 VGA compatible controller: NVIDIA Corporation Device 2208 (rev a1)
0000:af:00.1 Audio device: NVIDIA Corporation GA102 High Definition Audio Controller (rev a1)
echo 0000:af:00.0 > /sys/bus/pci/drivers/nvidia/unbind
echo "vfio-pci" > /sys/bus/pci/devices/0000\:af\:00.0/driver_override
echo 0000:af:00.0 > /sys/bus/pci/drivers/vfio-pci/bind
echo 0000:af:00.1 > /sys/bus/pci/drivers/nvidia/unbind
echo "vfio-pci" > /sys/bus/pci/devices/0000\:af\:00.1/driver_override
echo 0000:af:00.1 > /sys/bus/pci/drivers/vfio-pci/bind
If bind or unbind fails with an error, perform the unbind from the directory of the device ID instead.
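Unbinding through the device's own directory avoids having to know which driver currently owns the function (the audio function, for instance, is normally not owned by the nvidia driver). A sketch of that variant; `bind_vfio` is a hypothetical helper name, and the optional second argument is only there so the function can be exercised against a fake sysfs tree:

```shell
#!/bin/sh
# Rebind one PCI function to vfio-pci.
# $1 = full PCI address, e.g. 0000:af:00.0
# $2 = optional sysfs root (default: /sys)
bind_vfio() {
    dev="$1"
    sys="${2:-/sys}"
    devdir="$sys/bus/pci/devices/$dev"

    # Unbind via the device's own "driver" link, so the name of the
    # current driver (nvidia, snd_hda_intel, ...) does not matter.
    if [ -e "$devdir/driver" ]; then
        echo "$dev" > "$devdir/driver/unbind"
    fi

    # Same two steps as in the commands above.
    echo vfio-pci > "$devdir/driver_override"
    echo "$dev" > "$sys/bus/pci/drivers/vfio-pci/bind"
}

# Usage (as root, with the vfio-pci module loaded):
#   bind_vfio 0000:af:00.0
#   bind_vfio 0000:af:00.1
```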
- Check that the vfio-pci binding took effect. If you see "Kernel driver in use: vfio-pci", the binding succeeded.
# lspci -nnk -d 10de:
...
af:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2208] (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device [1043:884c]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau, nvidia_vgpu_vfio, nvidia
af:00.1 Audio device [0403]: NVIDIA Corporation GA102 High Definition Audio Controller [10de:1aef] (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device [1043:884c]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
At this point, running nvidia-smi on the host no longer shows the GPU that is being passed through.
Also note that the audio function belonging to the GPU must be bound to vfio-pci as well, otherwise the passthrough will fail.
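The same check can be done without parsing lspci output, by reading the driver symlink in sysfs. A minimal sketch; `pci_driver` is a hypothetical helper name, and the optional second argument only serves to point it at a fake sysfs tree:

```shell
#!/bin/sh
# Print the name of the kernel driver bound to a PCI function,
# or "none" if no driver is bound.
# $1 = full PCI address; $2 = optional sysfs root (default: /sys)
pci_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Usage: both functions of the card should report vfio-pci.
#   for dev in 0000:af:00.0 0000:af:00.1; do
#       echo "$dev -> $(pci_driver "$dev")"
#   done
```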