# F-Stack Development Guide

With the rapid development of NICs, the poor packet-processing performance of the Linux kernel has become a bottleneck. Because the growth of the Internet demands high-performance network processing, kernel bypass has attracted more and more attention. Several such technologies have appeared, including DPDK, NETMAP and PF_RING. The main idea of kernel bypass is that Linux handles only the control flow, while all data streams are processed in user space. Kernel bypass therefore avoids the performance bottlenecks caused by kernel packet copies, thread scheduling, system calls and interrupts. Furthermore, it can reach higher performance through a range of additional optimizations. Among these techniques, DPDK has been widely adopted because of its more thorough isolation from kernel scheduling and its active community support.

F-Stack is an open source, high-performance network framework based on DPDK, with the following characteristics:

1. Ultra-high network performance: the NIC can be driven to full load, with 10 million concurrent connections, 5 million RPS and 1 million CPS.
2. A user space stack ported from FreeBSD 11.01 that provides complete stack functionality. A large number of irrelevant features have been removed, which greatly improves performance.
3. Support for Nginx, Redis and other mature applications, so services can easily use F-Stack.
4. A multi-process architecture that is easy to scale.
5. A micro thread interface. Applications with long-running processing can use F-Stack to get high performance without handling complex asynchronous logic.
6. An epoll/kqueue interface that allows many kinds of applications to use F-Stack easily.

## Structure of F-Stack code

    ├── app         -- Nginx(1.11.10)/Redis(3.2.8)/Microthread framework
    ├── config.ini
    ├── doc
    ├── dpdk        -- Intel DPDK(16.07) directory
    ├── example     -- DEMO
    ├── freebsd     -- FreeBSD(11.0) Network Stack directory
    ├── lib         -- F-Stack lib directory
    ├── mk
    └── start.sh

## DPDK initialization

### PORT & SOCKET

F-Stack simplifies the standard DPDK initialization. By setting the NIC port and the CPU core mask, you bind ports to CPUs and to lcores on different socket nodes. If no binding is configured, port 0 and socket node 0 are used by default.

### KNI related

If the server has no dedicated management port, or all ports are used by the service process, enable KNI in the configuration file and set the protocols and port numbers that decide which packets F-Stack processes. The remaining packets are forwarded to the kernel through KNI, which keeps functions such as SSH management working.

## Revision of the FreeBSD Network Stack Based on DPDK

Since DPDK is open source, various open source network stacks built on DPDK are available to support higher-level applications. Some package the Linux network stack into a library, and some port the FreeBSD network stack.

At the beginning of this work, F-Stack used a simple TCP/IP stack that we developed ourselves. However, as services grew, this stack could not meet their needs, and continuing to develop and maintain a complete network stack would have been costly. So the FreeBSD network stack was ported into F-Stack. It provides complete features and can follow improvements from the community. Thanks to [libplebnet](https://gitorious.org/freebsd/kmm-sandbox/commit/fa8a11970bc0ed092692736f175925766bebf6af?p=freebsd:kmm-sandbox.git;a=tree;f=lib/libplebnet;h=ae446dba0b4f8593b69b339ea667e12d5b709cfb;hb=refs/heads/work/svn_trunk_libplebnet) and [libuinet](https://github.com/pkelsey/libuinet), this work became a lot easier.

To minimize the performance impact of resource sharing and of kernel mechanisms (such as scheduling and locks), F-Stack uses a multi-process architecture. The following are the changes made to the FreeBSD network stack.

### Scheduling

Removed kernel threads, interrupt threads, timer threads, sched, wakeup, sleep, etc. from the FreeBSD network stack.

### Lock

Removed the lock operations of the FreeBSD network stack, including mtx, rw, rm, sx, cond, etc.

### Memory related

Uses phymem, uma\_page\_slab\_hash, uma initialization, kmem\_malloc, malloc.

### Global variables

Initialization of pcpu, curthread, proc0 and thread0.

### Environment variable

setenv, getenv

### SYS_INIT

mi_startup

### Clock

timecounter, ticks, hz, timer

### Other

Linux and FreeBSD errno conversion, glue code, and removal of unnecessary modules.

## Applications use F-Stack

F-Stack provides the ff API (see *F-Stack\_API\_Reference*) to support applications. F-Stack also integrates third-party applications such as Nginx and Redis. A micro thread interface is also provided to help existing applications use F-Stack easily.

### Web application

HTTP web applications can use F-Stack with Nginx.

### Key-value application

Key-value database applications can use F-Stack with Redis, and can start multiple Redis instances.

### Stateful (high latency) applications

Stateful (high latency) applications, whose state needs to be stored for a long time, can use the F-Stack micro thread framework directly. Applications only need to focus on the service logic, and a high-performance asynchronous server can be achieved with synchronous programming.

## F-Stack configure file reference

DPDK related parameters, including the core mask and the number of NIC ports:

    [dpdk]
    lcore_mask=3
    ## Port mask, enable and disable ports.
    ## Default: all ports are enabled.
    #port_mask=1
    channel=4
    nb_ports=1
    promiscuous=1
    numa_on=1

    [port0]
    addr=192.168.1.2
    netmask=255.255.255.0
    broadcast=192.168.1.255
    gateway=192.168.1.1

    ## Packet capture path, this will hurt performance
    #pcap=./a.pcap

    ## KNI config: if enabled and method=reject,
    ## all packets that do not belong to the following tcp_port and udp_port
    ## will be forwarded to the kernel; if method=accept, all packets that belong to
    ## the following tcp_port and udp_port will be forwarded to the kernel.
    #[kni]
    #enable=1
    #method=reject
    #tcp_port=80
    #udp_port=53

    # The log section currently has no effect
    [log]
    level=1
    dir=/var/log

    ## FreeBSD network performance tuning configurations.
    ## Most native FreeBSD configurations are supported.
    [freebsd.boot]
    hz=100

    kern.ipc.maxsockets=262144

    net.inet.tcp.syncache.hashsize=4096
    net.inet.tcp.syncache.bucketlimit=100

    net.inet.tcp.tcbhashsize=65536

    [freebsd.sysctl]
    kern.ipc.somaxconn=32768
    kern.ipc.maxsockbuf=16777216

    net.inet.tcp.fast_finwait2_recycle=1
    net.inet.tcp.sendspace=16384
    net.inet.tcp.recvspace=8192
    net.inet.tcp.nolocaltimewait=1
    net.inet.tcp.cc.algorithm=htcp
    net.inet.tcp.sendbuf_max=16777216
    net.inet.tcp.recvbuf_max=16777216
    net.inet.tcp.sendbuf_auto=1
    net.inet.tcp.recvbuf_auto=1
    net.inet.tcp.sendbuf_inc=16384
    net.inet.tcp.recvbuf_inc=524288
    net.inet.tcp.inflight.enable=0
    net.inet.tcp.sack=1
    net.inet.tcp.blackhole=1
    net.inet.tcp.msl=2000
    net.inet.tcp.delayed_ack=0

    net.inet.udp.blackhole=1
    net.inet.ip.redirect=0

## F-Stack Application Start

F-Stack uses a multi-process architecture to remove resource sharing. There are a few points to note when starting an application that runs on F-Stack. We take start.sh under the F-Stack root directory as an example.

    #!/bin/bash

    function usage() {
        echo "F-Stack app start tool"
        echo "Options:"
        echo " -c [conf]    Path of config file"
        echo " -b [bin]     Path of binary"
        echo " -h           show this help"
        exit
    }

    conf=config.ini
    bin=./helloworld

    while getopts "c:b:h" args
    do
        case $args in
            c)
                conf=$OPTARG
                ;;
            b)
                bin=$OPTARG
                ;;
            h)
                usage
                exit 0
                ;;
        esac
    done

    # Read lcore_mask from the config file and interpret it as a hexadecimal number.
    allcmask0x=`cat ${conf}|grep lcore_mask|awk -F '=' '{print $2}'`
    ((allcmask=16#$allcmask0x))

    # Match the core mask against the actual CPU cores and compute the
    # startup parameters of every process, including:
    # -c coremask (single-core mask of the process, in hexadecimal)
    # --proc-type=primary/secondary
    # --num-procs = number of processes
    # --proc-id = current process ID, increasing from 0
    num_procs=0
    PROCESSOR=$(grep 'processor' /proc/cpuinfo |sort |uniq |wc -l)
    for((i=0;i<${PROCESSOR};++i))
    do
        mask=`echo "2^$i"|bc`
        ((result=${allcmask} & ${mask}))
        if [ ${result} != 0 ]
        then
            ((num_procs++));
            cpuinfo[$i]=1
        else
            cpuinfo[$i]=0
        fi
    done

    # Start one process per selected core. The first one is the primary process,
    # and the others are secondary processes started after it.
    proc_id=0
    for((i=0;i<${PROCESSOR};++i))
    do
        if ((cpuinfo[$i] == 1))
        then
            cmask=`echo "2^$i"|bc`
            cmask=`echo "obase=16;${cmask}"|bc`
            if ((proc_id == 0))
            then
                #echo "${bin} ${conf} -c $cmask --proc-type=primary --num-procs=${num_procs} --proc-id=${proc_id}"
                ${bin} ${conf} -c ${cmask} --proc-type=primary --num-procs=${num_procs} --proc-id=${proc_id} &
                # Give the primary process time to initialize before starting secondaries.
                sleep 5
            else
                #echo "${bin} ${conf} -c $cmask --proc-type=secondary --num-procs=${num_procs} --proc-id=${proc_id}"
                ${bin} ${conf} -c $cmask --proc-type=secondary --num-procs=${num_procs} --proc-id=${proc_id} &
            fi
            ((proc_id++))
        fi
    done
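
To make the mask arithmetic in start.sh easier to follow, here is a minimal dry-run sketch. It only prints the command lines that would be started and launches nothing. The helper names `core_masks` and `plan_start` are hypothetical and not part of F-Stack. The sketch also assumes that every bit set in `lcore_mask` corresponds to an existing CPU core, so it skips the `/proc/cpuinfo` check that start.sh performs, and it uses bash arithmetic instead of `bc`.

```shell
#!/bin/bash

# Hypothetical helper (not part of F-Stack): print one hexadecimal
# single-core mask per bit set in the given hexadecimal lcore_mask.
core_masks() {
    local mask=$((16#$1))   # interpret the argument as hexadecimal, as start.sh does
    local bit=0
    while ((mask != 0)); do
        if ((mask & 1)); then
            printf '%x\n' $((1 << bit))
        fi
        mask=$((mask >> 1))
        bit=$((bit + 1))
    done
}

# Hypothetical helper (not part of F-Stack): print the command line of every
# process that would be started, without running anything.
plan_start() {
    local bin=$1 conf=$2 lcore_mask=$3
    local masks num_procs proc_type cmask
    local proc_id=0
    masks=( $(core_masks "$lcore_mask") )   # one entry per selected core
    num_procs=${#masks[@]}
    for cmask in "${masks[@]}"; do
        # The first process is the primary one, all others are secondary.
        if ((proc_id == 0)); then
            proc_type=primary
        else
            proc_type=secondary
        fi
        echo "$bin $conf -c $cmask --proc-type=$proc_type --num-procs=$num_procs --proc-id=$proc_id"
        proc_id=$((proc_id + 1))
    done
}

# lcore_mask=3 selects cores 0 and 1, so two processes are planned:
#   ./helloworld config.ini -c 1 --proc-type=primary --num-procs=2 --proc-id=0
#   ./helloworld config.ini -c 2 --proc-type=secondary --num-procs=2 --proc-id=1
plan_start ./helloworld config.ini 3
```

Printing the plan first is a cheap way to check a new `lcore_mask` value before starting real processes. Note that the mask is read as hexadecimal, so `lcore_mask=10` selects core 4 only, not cores 1 and 3.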