Installing DPDK + OVS (Open vSwitch) on Debian, with KVM XML templates
Last edited by 塔奇克马 on 2018-7-2 04:04
DPDK acceleration for a 10-gigabit virtual switch.
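99% of people who run virtual machines will never need this, and it is hard to tune. It is meant to be paired with a 10G network, and some of the acceleration features live on the 10G NIC itself. For 99% of people who need a virtual switch, a macvtap bridge is enough.
The speedup comes from avoiding CPU interrupts (polling instead), shared memory, and doing the packet processing in user space.
I tinkered with it anyway, so if anyone has a use for it, have a play.
If your speed tests are disappointing, please dig through the documentation on your own.
Minimum requirement: 4 GB of RAM
Recommended: a 10G NIC, a CPU that supports 1 GB hugepages, as much RAM as possible, SR-IOV, ACS, RSS, etc.
The system used for this install is openmediavault, which is based on Debian 9 (stretch).
Costs: every VM that uses this switch must be backed by hugepages. The switch itself needs at least 1025 MB of memory, which works out to 513 pages with 2 MB hugepages. DPDK's poll mode also keeps one logical core at 100%.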
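The 513 figure is just 1025 MB divided by the 2 MB page size and rounded up. A minimal sketch of that arithmetic, in case you want to redo it for a different memory size; the function name hugepages_needed is made up for this example.

```shell
#!/bin/sh
# Round a memory size in MB up to a whole number of hugepages.
# Usage: hugepages_needed <memory_mb> <page_size_mb>
# (hugepages_needed is a hypothetical helper, not part of DPDK or OVS.)
hugepages_needed() {
    # Integer ceiling division: (a + b - 1) / b
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Pages needed for the switch alone with 2 MB pages
hugepages_needed 1025 2
```

Remember that the VMs draw from the same pool, so the total you reserve has to cover their memory as well.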
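First download the two tarballs, dpdk-stable-18.02.2 and openvswitch-2.9.0, and put them in the same folder (do not change the versions). I put them under /usr/src/.
Install some software first.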
apt install xz-utils
apt-get install libnuma-dev -y
apt-get install linux-headers-`uname -r` -y
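apt install gcc make autoconf automake libtool openssl libssl-dev python2.7-dev python3.5-dev desktop-file-utils groff graphviz checkpolicy selinux-policy-dev python-sphinx python-twisted-core python-zope.interface libcap-ng-dev libpcap0.8-dev -y
Edit the boot command line to add hugepages and enable VT-d. The isolcpus parameter tells the Linux kernel which logical cores to stay off. Here I keep it off the last two logical cores, 2 and 3, and leave logical cores 0 and 1 to the kernel. This is mainly to stop interrupts from hurting performance; the parameter is not mandatory.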
sed -i 's/GRUB_CMDLINE_LINUX=""/GRUB_CMDLINE_LINUX="default_hugepagesz=2m hugepagesz=2m hugepages=513 iommu=pt intel_iommu=on isolcpus=2,3 "/' /etc/default/grub
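update-grub
Hmm... on Debian the startup script /etc/rc.local does not exist, so it has to be created. Add a few things to it as well, such as loading the vfio module. DPDK's vfio driver needs it (there are also drivers that do not require passthrough), and so does passthrough.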
echo \
'#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.
modprobe vfio-pci
exit 0
' \
>/etc/rc.local
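chmod +x /etc/rc.local
Depending on your setup, add hugepage preallocation to the startup script
sed -i '/^exit 0/i echo 1500 > \/sys\/kernel\/mm\/hugepages\/hugepages-2048kB\/nr_hugepages' /etc/rc.local
Here I allocated 1500 pages, roughly 3 GB of memory. The virtual switch is not the only consumer, the VMs need hugepages too, which is why the number is this large. Work out the right value for your own setup.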
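After the next boot it is worth checking that the pages were really reserved. The following is a small sketch that reads the standard HugePages_Total, HugePages_Free and Hugepagesize fields from /proc/meminfo; the function name hugepage_summary and its optional file argument are my own additions so the parsing can be tried on a sample file.

```shell
#!/bin/sh
# Print "<total> <free> <size_kB>" for hugepages, parsed from meminfo-style text.
# Reads /proc/meminfo by default; a different file can be passed for testing.
# (hugepage_summary is a hypothetical helper name.)
hugepage_summary() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free = $2 }
        /^Hugepagesize:/    { size = $2 }
        END { print total, free, size }
    ' "${1:-/proc/meminfo}"
}

# Only query the live system when /proc/meminfo is available
if [ -r /proc/meminfo ]; then
    hugepage_summary
fi
```

With the settings from this post the first number should be 1500 and the last one 2048, assuming the kernel was able to find enough contiguous memory.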
The next part is also hugepage related; just run it as is.
mkdir -p /mnt/huge
echo "
nodev /mnt/huge hugetlbfs defaults 0 0
" >> /etc/fstab
Extract the sources
cd /usr/src/
xz -d dpdk-18.02.2.tar.xz
tar xvf dpdk-18.02.2.tar
tar xzvf openvswitch-2.9.0.tar.gz
Add the environment variables
echo "
export DPDK_DIR=/usr/src/dpdk-stable-18.02.2
export DPDK_TARGET=x86_64-native-linuxapp-gcc
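export DPDK_BUILD=\$DPDK_DIR/\$DPDK_TARGET
" >> ~/.bashrc
Reboot here so the boot parameters and environment variables take effect.
reboot now
Compile the two packages (I only have 4 cores, hence -j4; adjust it if you have more)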
cd /usr/src/dpdk-stable-18.02.2/
make install -j4 T=$DPDK_TARGET DESTDIR=DPDK_install
cd /usr/src/openvswitch-2.9.0/
./boot.sh
./configure --with-dpdk=$DPDK_BUILD
make install -j4
Create a few OVS-related directories and files
mkdir -p /usr/local/etc/openvswitch
ovsdb-tool create /usr/local/etc/openvswitch/conf.db /usr/local/share/openvswitch/vswitch.ovsschema
mkdir -p /usr/local/var/run/openvswitch
mkdir -p /root/log/
Start it up
ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \
--remote=db:Open_vSwitch,Open_vSwitch,manager_options \
--private-key=db:Open_vSwitch,SSL,private_key \
--certificate=db:Open_vSwitch,SSL,certificate \
--bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert \
--pidfile --detach --log-file=/root/log/ovsdb-server.log
ovs-vswitchd unix:/usr/local/var/run/openvswitch/db.sock --pidfile --detach --log-file=/root/log/ovs-vswitchd.log
Tidy up the vfio permissions (not important)
chmod a+x /dev/vfio
chmod 0666 /dev/vfio/*
Add more environment variables
echo "
export PATH=\$PATH:/usr/local/share/openvswitch/scripts
export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
" >> ~/.bashrc
Log out and log back in
exit
Look up the NIC's PCI ID
/usr/src/dpdk-stable-18.02.2/usertools/dpdk-devbind.py --status
Note the ID, then bind your NIC. Keep in mind that vfio follows the same rules as passthrough: every device under the same slot (that is, in the same IOMMU group) has to be bound.
My NIC's PCI ID is 0000:05:00.0
/usr/src/dpdk-stable-18.02.2/usertools/dpdk-devbind.py --bind=vfio-pci 0000:05:00.0
or
$DPDK_DIR/usertools/dpdk-devbind.py --bind=vfio-pci 0000:05:00.0
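To see which devices have to be bound together with the NIC, you can list the members of its IOMMU group. This is a minimal sketch that assumes the usual Linux sysfs layout; the function name iommu_group_members and the optional sysfs root argument are made up here, the latter only so the logic can be exercised against a fake directory tree.

```shell
#!/bin/sh
# List every PCI device that shares an IOMMU group with the given device.
# Usage: iommu_group_members <pci_address> [sysfs_root]
# (iommu_group_members is a hypothetical helper; sysfs_root defaults to /sys.)
iommu_group_members() {
    dev=$1
    root=${2:-/sys}
    group_dir="$root/bus/pci/devices/$dev/iommu_group/devices"
    if [ ! -d "$group_dir" ]; then
        echo "no IOMMU group found for $dev (is the IOMMU enabled?)" >&2
        return 1
    fi
    ls "$group_dir"
}

# 0000:05:00.0 is the NIC address used in this post; replace it with yours
iommu_group_members 0000:05:00.0 || true
```

If the list contains more than the NIC itself, those other devices are the ones the vfio rule above is talking about.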
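If you find VT-d too restrictive, with the devices in a slot having to be claimed exclusively, you can bind with the UIO driver instead, in which case the commands look like this. Load two modules; these should really go into rc.local so that they are loaded at boot.
modprobe uio
insmod /usr/src/dpdk-stable-18.02.2/x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
Bind with UIO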
/usr/src/dpdk-stable-18.02.2/usertools/dpdk-devbind.py --bind=igb_uio 0000:05:00.0
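Allocate memory. Here I gave OVS-DPDK 1025 MB, which is the bare minimum.
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=1025
A few parameter commands
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
Choose which core OVS is pinned to. Here I use the fourth logical core. The value is hexadecimal. How is it calculated? A 1 marks a core to use and a 0 marks a core not to use, so in binary this is 1000. Type that binary number into a calculator, convert it to hexadecimal, and you get 8.
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x8
Optional parameter. You can of course assign more cores; the rule is the same as above.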
ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=0xc
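If you would rather not do the binary-to-hex conversion by hand, the masks can be computed from a list of zero-based core numbers. A small sketch of the rule described above; cores_to_mask is a name I made up for this example.

```shell
#!/bin/sh
# Build a hexadecimal CPU mask from a list of zero-based logical core numbers.
# Usage: cores_to_mask <core> [<core> ...]
# (cores_to_mask is a hypothetical helper, not an OVS or DPDK tool.)
cores_to_mask() {
    mask=0
    for core in "$@"; do
        # Set the bit whose position equals the core number
        mask=$(( mask | (1 << core) ))
    done
    printf '0x%x\n' "$mask"
}

cores_to_mask 3      # fourth logical core only: prints 0x8
cores_to_mask 2 3    # third and fourth logical cores: prints 0xc
```

The two results match the dpdk-lcore-mask=0x8 and pmd-cpu-mask=0xc values used in this post.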
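Optional parameter. With current QEMU versions it has no negative effect at startup.
ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
Create the virtual switch br0
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
Add the physical DPDK NIC that was bound to vfio/uio to the virtual switch
ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:05:00.0
At this point you can run top, press c and then 1, and you will see one core fully loaded. That means it started successfully.
On this port you can enable zero copy and set the physical port queue size (the larger the queue, the higher the latency but also the higher the throughput, and vice versa). Note that the OVS documentation describes dq-zero-copy as an experimental option for vhost-user client ports, so check whether it applies to your port type.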
ovs-vsctl set Interface dpdk0 options:n_txq_desc=128 \
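options:dq-zero-copy=true
Optional parameter. You can switch your network to jumbo frames with an MTU of 9000 (the hosts, switches and VM NICs connected to it must all be changed to match).
ovs-vsctl -- set Interface dpdk0 mtu_request=9000
Set the number of rx queues. Choose this based on how many logical cores of the CPU in the same socket you assigned. If pmd-cpu-mask covers two logical cores, the value is 2.
ovs-vsctl set Interface dpdk0 options:n_rxq=2
Create a port for communicating with the KVM guest
ovs-vsctl add-port br0 vhost-user-1 -- set Interface vhost-user-1 \
type=dpdkvhostuser
By default the generated UNIX socket file is at
/usr/local/var/run/openvswitch/vhost-user-1
You will need this path when editing the KVM XML.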
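Before starting a VM it can save some head-scratching to confirm that the socket is actually there. A tiny sketch, assuming the default path shown above; vhost_socket_ready is a name invented for this example.

```shell
#!/bin/sh
# Succeed only if the given path exists and is a UNIX domain socket.
# (vhost_socket_ready is a hypothetical helper name.)
vhost_socket_ready() {
    [ -S "$1" ]
}

sock=/usr/local/var/run/openvswitch/vhost-user-1
if vhost_socket_ready "$sock"; then
    echo "$sock is ready"
else
    echo "$sock not found; is ovs-vswitchd running and the port created?"
fi
```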
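At this point the basic DPDK-OVS setup is complete.
You still need to set up flow tables, though. Without them, performance is probably only about half, judging from my tests. The command looks roughly like this... you will have to read the documentation yourself.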
ovs-ofctl add-flow
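As one illustration of what such rules can look like, the sketch below prints a pair of ovs-ofctl commands that forward traffic in both directions between two OpenFlow ports. This is not the flow table used in this post: the port numbers 1 and 2 are placeholders, and print_pair_flows is a helper name I made up. The commands are only printed so you can check them before running anything.

```shell
#!/bin/sh
# Print ovs-ofctl commands that forward traffic in both directions
# between two OpenFlow port numbers on a bridge.
# Usage: print_pair_flows <bridge> <port_a> <port_b>
# (print_pair_flows is a hypothetical helper; it does not touch OVS itself.)
print_pair_flows() {
    echo "ovs-ofctl add-flow $1 in_port=$2,actions=output:$3"
    echo "ovs-ofctl add-flow $1 in_port=$3,actions=output:$2"
}

# Ports 1 and 2 are assumptions; look up the real numbers with
# "ovs-ofctl show br0" before running the printed commands.
print_pair_flows br0 1 2
```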
You may also need to enable NIC features such as RSS.
Reference documentation
http://docs.openvswitch.org/en/latest/intro/install/dpdk/
http://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/
and so on... I have forgotten the others...
The command for editing a KVM virtual machine's XML is
virsh edit <name of your VM>
KVM XML reference code
-------------------------
<!--
WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
virsh edit win7
or other application using the libvirt API.
-->
<domain type='kvm'>
<name>win7</name>
<uuid>f495616f-58c3-49ac-8b36-eef635d8bc21</uuid>
<memory unit='KiB'>1048576</memory>
<currentMemory unit='KiB'>1048576</currentMemory>
<memoryBacking>
<hugepages>
<page size='2048' unit='KiB' nodeset='0'/>
</hugepages>
</memoryBacking>
<vcpu placement='static'>2</vcpu>
<cputune>
<shares>4096</shares>
<vcpupin vcpu='0' cpuset='0'/>
<vcpupin vcpu='1' cpuset='1'/>
<emulatorpin cpuset='0-1'/>
</cputune>
<os>
<type arch='x86_64' machine='pc-i440fx-2.8'>hvm</type>
</os>
<features>
<acpi/>
<apic/>
<hyperv>
<relaxed state='on'/>
<vapic state='on'/>
<spinlocks state='on' retries='8191'/>
</hyperv>
<vmport state='off'/>
</features>
<cpu mode='custom' match='exact'>
<model fallback='allow'>SandyBridge</model>
<numa>
<cell id='0' cpus='0-1' memory='1048576' unit='KiB' memAccess='shared'/>
</numa>
</cpu>
<clock offset='localtime'>
<timer name='rtc' tickpolicy='catchup'/>
<timer name='pit' tickpolicy='delay'/>
<timer name='hpet' present='no'/>
<timer name='hypervclock' present='yes'/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<pm>
<suspend-to-mem enabled='no'/>
<suspend-to-disk enabled='no'/>
</pm>
<devices>
<emulator>/usr/bin/kvm</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='/var/lib/libvirt/images/win7_PIP.qcow2'/>
<target dev='vda' bus='virtio'/>
<boot order='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</disk>
<controller type='usb' index='0' model='ich9-ehci1'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
</controller>
<controller type='usb' index='0' model='ich9-uhci1'>
<master startport='0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
</controller>
<controller type='usb' index='0' model='ich9-uhci2'>
<master startport='2'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
</controller>
<controller type='usb' index='0' model='ich9-uhci3'>
<master startport='4'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
</controller>
<controller type='pci' index='0' model='pci-root'/>
<controller type='virtio-serial' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</controller>
<interface type='vhostuser'>
<mac address='52:54:00:c8:cb:40'/>
<source type='unix' path='/usr/local/var/run/openvswitch/vhost-user-1' mode='client'/>
<model type='virtio'/>
<driver>
<host csum='off' gso='off' tso4='off' tso6='off' ecn='off' mrg_rxbuf='off'/>
<guest csum='off' tso4='off' tso6='off' ecn='off' ufo='off'/>
</driver>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<serial type='pty'>
<target port='0'/>
</serial>
<console type='pty'>
<target type='serial' port='0'/>
</console>
<channel type='spicevmc'>
<target type='virtio' name='com.redhat.spice.0'/>
<address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
<input type='tablet' bus='usb'>
<address type='usb' bus='0' port='1'/>
</input>
<input type='mouse' bus='ps2'/>
<input type='keyboard' bus='ps2'/>
<graphics type='spice' autoport='yes' listen='0.0.0.0'>
<listen type='address' address='0.0.0.0'/>
</graphics>
<sound model='ich6'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</sound>
<video>
<model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='2'/>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='3'/>
</redirdev>
<memballoon model='virtio'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</memballoon>
</devices>
</domain>
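In the original post, red marks the parts you need to add and pay attention to, and blue marks the parts that are probably optional. You can of course also refer to the official example:
<domain type='kvm'>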
<name>demovm</name>
<uuid>4a9b3f53-fa2a-47f3-a757-dd87720d9d1d</uuid>
<memory unit='KiB'>4194304</memory>
<currentMemory unit='KiB'>4194304</currentMemory>
<memoryBacking>
<hugepages>
<page size='2' unit='M' nodeset='0'/>
</hugepages>
</memoryBacking>
<vcpu placement='static'>2</vcpu>
<cputune>
<shares>4096</shares>
<vcpupin vcpu='0' cpuset='4'/>
<vcpupin vcpu='1' cpuset='5'/>
<emulatorpin cpuset='4,5'/>
</cputune>
<os>
<type arch='x86_64' machine='pc'>hvm</type>
<boot dev='hd'/>
</os>
<features>
<acpi/>
<apic/>
</features>
<cpu mode='host-model'>
<model fallback='allow'/>
<topology sockets='2' cores='1' threads='1'/>
<numa>
<cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
</numa>
</cpu>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/bin/qemu-system-x86_64</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='none'/>
<source file='/root/CentOS7_x86_64.qcow2'/>
<target dev='vda' bus='virtio'/>
</disk>
<interface type='vhostuser'>
<mac address='00:00:00:00:00:01'/>
<source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser0' mode='client'/>
<model type='virtio'/>
<driver queues='2'>
<host mrg_rxbuf='on'/>
</driver>
</interface>
<interface type='vhostuser'>
<mac address='00:00:00:00:00:02'/>
<source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser1' mode='client'/>
<model type='virtio'/>
<driver queues='2'>
<host mrg_rxbuf='on'/>
</driver>
</interface>
<serial type='pty'>
<target port='0'/>
</serial>
<console type='pty'>
<target type='serial' port='0'/>
</console>
</devices>
</domain>
That is the end of the tutorial.
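Looks really professional. What do you do for a living?
I never imagined 2 MB pages could be put to use like this. How much of an improvement is it over 4 KB pages?
zatsuza posted on 2018-7-2 03:42:
I never imagined 2 MB pages could be put to use like this. How much of an improvement is it over 4 KB pages?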
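I don't know it that well. I just tinkered with it and then left it alone; this post is here as a record.
From the part I do understand,
the benefit shows up mainly in features.
The virtual machine
has shared memory mapping enabled:
memAccess='shared'
This parameter is specific to hugepages, and it is also a parameter this DPDK setup requires.
It may be related to the zero-copy feature.
Respect. Does the OP have a tech blog?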