CDlibre.org - Newsletter no. 638 - 03/06/2018

These are the programs updated or added to cdlibre.org between 28 May and 3 June 2018.
Remember that the June compilations will be available from Monday, 4 June.
New programs added: None
Updated programs: Anki 2.0.52 - Atom 1.27.2 - AutoHotkey - Bulk Crap Uninstaller 4.4 - Calibre 3.25.0 - Connectagram 1.2.7 - Django 2.0.6 - FocusWriter 1.6.13 - GeoGebra 5.0.471 - Git - GitHub Desktop 1.2.2 - Hexalate 1.1.3 - JSettlers2 1.2.01 - KMyMoney 4.8.2 - M.A.M.E. 0.198 - Node.js 10.3.0 - Packer 1.2.4 - PDFCreator 3.2.1 - Peg-E 1.2.5 - PhET 1.0 2018.05.27 - PhotoQT 1.7 - PNotes.Net - Portable Puzzle Collection 2018.06.02 - ProjectLibre 1.8 - Q4OS Windows installer 2.5 Scorpion - qBittorrent 4.1.1 - Ren'Py 7.0.0 - Rust 1.26.1 - Scratch 2.0.460.0.1 - Shotcut 18.06 - Signal 1.12.0 - Snappy Driver Installer 1.18.06 - Tanglet 1.5.2 - Telegram 1.3.0 - Tetzle 2.1.3 - TripleA - VLC media player 3.0.3 - Wings 3D 2.1.7 - wxFormBuilder 3.7.0 - Zikula 2.0.7
Web Development - Web editors
Atom 1.27.2 (64-bit) - Windows and Linux - English - License - F T - 144.9 MB - 31/05/18 - Homepage - Download
Atom is a source code editor developed by GitHub that supports Node.js plug-ins.
Web Development - Content management systems (CMS)
Zikula 2.0.7 - Windows and Linux - English - License - F T - 34.0 MB - 28/05/18 - Homepage - Download
Zikula is a content management system (CMS) based on PHP and MySQL. Before June 2008 this program was called PostNuke.
Educational - Physics
PhET 1.0 2018.05.27 - Java - English - License - F T - 1276.5 MB - 27/05/18 - Homepage - Download
PhET is a set of animated, interactive simulations for teaching and learning physics. The program is in English, but many of the simulations are also translated into Spanish. This version includes both the simulations and the teaching activities.
Educational - English
Connectagram 1.2.7 - Windows and Linux - English - License - F T - 9.1 MB - 29/05/18 - Homepage - Download
Connectagram is a word-unscrambling game. The program presents a set of words with their letters scrambled, and the goal is to rearrange them into meaningful words. You can choose the number of words, their length and their layout on the board.
Tanglet 1.5.2 - Windows and Linux - English - License - F T - 15.0 MB - 29/05/18 - Homepage - Download
Tanglet is a single-player game inspired by Boggle. The program shows 16 letters (arranged in a 4-by-4 square) and the player must find as many words as possible within a set time by combining adjacent letters. The program indicates whether the words are correct and, once the game is over, shows all the words that could have been found.
Educational - Memorization
Anki 2.0.52 - Windows and Linux - Spanish - License - F T - 28.2 MB - 31/05/18 - Homepage - Download
Anki is a memorization program for creating flashcards (which can include text, images, videos or LaTeX formulas) with questions and answers.
Educational - Programming
Scratch 2.0.460.0.1 - Windows - Spanish - License - F T - 58.3 MB - 25/05/18 - Homepage - Download
Scratch is a programming language aimed at children aged 8 and up that lets them create multimedia animations, games or interactive stories in a graphical programming environment where instructions are handled visually as interlocking blocks. The resulting projects can be shared on the program's website. When installed, the program is in English, but it can be switched to Spanish, Catalan or Galician through the "Language" menu option. Adobe AIR must be installed before installing Scratch.
Graphics - 3D
Wings 3D 2.1.7 (64-bit) - Windows and Linux - English - License - F T - 18.2 MB - 30/05/18 - Homepage - Download
Wings 3D is a 3D subdivision modeler.
Graphics - Image browsers
PhotoQT 1.7 (64-bit) - Windows and Linux - Spanish - License - F T - 32.4 MB - 21/05/18 - Homepage - Download
PhotoQT is a simple image viewer.
Internet - Communication
Signal 1.12.0 - Windows and Linux - Spanish - License - F T - 64.8 MB - 31/05/18 - Homepage - Download
Signal is an end-to-end encrypted Internet messaging service for holding text, audio or video conversations or sharing files securely and privately. The same account can be used from different devices, mobile or desktop.
Telegram 1.3.0 - Windows and Linux - Spanish - License - F T - 21.5 MB - 01/06/18 - Homepage - Download
Telegram is an Internet messaging service for holding text, audio or video conversations or sharing files securely and privately. The same account can be used from different devices, mobile or desktop.
Internet - P2P
qBittorrent 4.1.1 (64-bit) - Windows and Linux - Spanish - License - F T - 21.9 MB - 27/05/18 - Homepage - Download
qBittorrent is a BitTorrent client with numerous advanced features (torrent search, tracker and download control, torrent creation, etc.). When installed, the program is in English, but it can be switched to Spanish, Catalan, Galician or Basque through the menu Tools > Options > Behaviour > User Interface Language.
Internet - Servers
Node.js 10.3.0 (64-bit) - Windows and Linux - English - License - F T - 16.4 MB - 29/05/18 - Homepage - Download
Node.js is a development environment and server for web applications written in JavaScript.
Games - Text adventures
Ren'Py 7.0.0 - Windows and Linux - English - License - F T - 72.0 MB - 01/06/18 - Homepage - Download
Ren'Py is a visual novel editor. The installer does not really install anything; it simply unpacks the files into a folder. To use the program, run renpy.exe directly.
Games - Strategy
TripleA (64-bit) - Java - English - License - F T - 16.5 MB - 08/03/18 - Homepage - Download
TripleA is a turn-based strategy game, a clone of the board game Axis and Allies. It is a Java program, so the Java Runtime Environment must be installed.
Games - The Settlers of Catan
JSettlers2 1.2.01 - Java - English - License - F T - 910 KB - 02/06/18 - Homepage - Download
JSettlers2 (Java Settlers of Catan) is a version of the board game The Settlers of Catan for playing over the Internet or locally, against human players or against the computer. The program can also act as a game server for other players. The program does not need installation: run the downloaded file directly.
Games - Virtual machines
M.A.M.E. 0.198 (64-bit) - Windows and Linux - English - License - F T - 58.1 MB - 31/05/18 - Homepage - Download
M.A.M.E. (Multiple Arcade Machine Emulator) is an arcade machine emulator. The emulator does not include any games, but some games can be downloaded as ROMs from the program's website. The installer does not really install anything; it simply unpacks the files into a folder. To start M.A.M.E., run mame.exe.
Games - Puzzles
Hexalate 1.1.3 - Windows and Linux - Spanish - License - F T - 7.8 MB - 29/05/18 - Homepage - Download
Hexalate is a puzzle in which seven circular pieces, each with six spokes in six different colors, must be placed so that the colors of touching spokes match.
Portable Puzzle Collection 2018.06.02 - Windows and Linux - English - License - F T - 5.9 MB - 02/06/18 - Homepage - Download
Portable Puzzle Collection is a collection of 39 independent programs, each devoted to a different puzzle: Black Box, Bridges, Cube, Dominosa, Fifteen, Filling, Flip, Flood, Galaxies, Guess, Inertia, Keen, Light Up, Loopy, Magnets, Map, Mines, Net, Netslide, Palisade, Pattern, Pearl, Pegs, Range, Rectangles, Same Game, Signpost, Singles, Sixteen, Slant, Solo, Tents, Towers, Tracks, Twiddle, Undead, Unequal, Unruly and Untangle. In each game you can choose the puzzle's difficulty level.
Games - Solitaire
Peg-E 1.2.5 - Windows and Linux - English - License - F T - 7.7 MB - 29/05/18 - Homepage - Download
Peg-E is a classic peg solitaire game that generates random boards at one hundred difficulty levels.
Games - Other
Tetzle 2.1.3 - Windows and Linux - English - License - F T - 14.5 MB - 29/05/18 - Homepage - Download
Tetzle generates jigsaw puzzles from images created by the user, with the twist that the puzzle pieces are tetrominoes.
Mathematics - Geometry
GeoGebra 5.0.471 - Java - Spanish - License - F T - 85.8 MB - 30/05/18 - Homepage - Download
GeoGebra is a dynamic 3D geometry program in which equations and coordinates can also be entered directly, which makes it possible to work on aspects of algebra and calculus as well (functions, derivatives, etc.). It is a Java program, so the Java Runtime Environment must be installed.
Office - Finance
KMyMoney 4.8.2 (64-bit) - Windows and Linux - Spanish - License - F T - 99.8 MB - 01/06/18 - Homepage - Download
KMyMoney is the KDE project's personal finance program. When installed, the program is in English, but it can be switched to Spanish through the menu Preferences > KDE languages configuration. MS Visual C++ 2010 must be installed before installing KMyMoney.
Office - Project management
ProjectLibre 1.8 (64-bit) - Java - Spanish - License - F T - 62.4 MB - 29/05/18 - Homepage - Download
ProjectLibre is a project management program based on OpenProj, to which it has added compatibility with MS Project 2010 and interface changes. It is a Java program, so the Java Runtime Environment must be installed.
Office - Electronic books (ebooks)
Calibre 3.25.0 (64-bit) - Windows and Linux - Spanish - License - F T - 67.0 MB - 31/05/18 - Homepage - Download
Calibre is an ebook management program that lets you organize your book collection, view and convert books in numerous formats and send them to an ebook reader, among many other features.
Office - PDF
PDFCreator 3.2.1 - Windows - Spanish - License - F T - 32.0 MB - 31/05/18 - Homepage - Download
PDFCreator lets you create PDF documents from any application that can print. PDFCreator adds a new printer driver that creates PDF files instead of printing. This program is based on GPL Ghostscript. When installing it, it is advisable to choose the "Ajustes de experto" (expert settings) option on the installer's first screen and to watch each installation step so as not to install applications bundled with PDFCreator.
Office - Post-it
PNotes.Net - Windows - English - License - F T - 4.4 MB - 01/06/18 - Homepage - Download
PNotes is a program for creating and organizing post-it style notes.
Office - Word processors
FocusWriter 1.6.13 - Windows and Linux - Spanish - License - F T - 33.4 MB - 29/05/18 - Homepage - Download
FocusWriter is a basic word processor whose main feature is its interface, designed to keep the user focused on the text by hiding everything on screen except the text itself (moving the mouse to the edges brings back the menus).
Programming - Python
Django 2.0.6 - Windows and Linux - English - License - F T - 7.6 MB - 01/06/18 - Homepage - Download
Django is a web development framework. To install it, unpack the .tar.gz file and follow the installation instructions on the program's website.
Programming - Version control systems
Git (64-bit) - Windows and Linux - English - License - F T - 38.8 MB - 31/05/18 - Homepage - Download
Git is a distributed version control system designed to be efficient even with very large projects, originally created by Linus Torvalds for Linux kernel development.
GitHub Desktop 1.2.2 - Windows and Linux - English - License - F T - 77.8 MB - 30/05/18 - Homepage - Download
GitHub Desktop is a desktop application for managing our GitHub repositories.
Programming - wxWidgets
wxFormBuilder 3.7.0 - Windows and Linux - English - License - F T - 11.9 MB - 29/05/18 - Homepage - Download
wxFormBuilder is a visual GUI builder for wxWidgets.
Programming - Other
Rust 1.26.1 (64-bit) - Windows and Linux - English - License - F T - 4.4 MB - 29/05/18 - Homepage - Download
Rust is a general-purpose programming language developed by Mozilla to build Firefox's future rendering engine. Since December 2016, the recommended way to install Rust is the RustUp tool, which downloads and configures all the necessary components. MS Visual C++ Build Tools 2015 must be installed before installing Rust.
Utilities - Desktop
AutoHotkey - Windows - English - License - F T - 2.9 MB - 02/06/18 - Homepage - Download
AutoHotkey is a program for creating macros and keyboard (and mouse) shortcuts through scripts to automate any repetitive task.
Utilities - GNU Linux
Q4OS Windows installer 2.5 Scorpion - Windows and Linux - English - License - F T - 5.6 MB - 03/06/18 - Homepage - Download
Q4OS Windows Installer is an installer for the Q4OS 2.4 Scorpion distribution that lets you install it as if it were a Windows application, without partitioning the hard disk. Q4OS is a distribution aimed at Windows users, based on Debian and using the Trinity desktop (although other desktops can easily be installed).
Utilities - Virtualization
Packer 1.2.4 (64-bit) - Windows and Linux - English - License - F T - 18.0 MB - 29/05/18 - Homepage - Download
Packer is a command-line tool for creating machine images for many platforms (EC2, Azure, VirtualBox, VMware, etc.). Besides creating the image, it can install operating systems and applications into the images from scripts. The program does not need installation: unpack the zip file into a folder and run packer.exe.
Utilities - Other
Bulk Crap Uninstaller 4.4 - Windows - Spanish - License - F T - 3.1 MB - 02/06/18 - Homepage - Download
Bulk Crap Uninstaller is an advanced application uninstaller that finds leftovers from previous uninstallations, applications without uninstallers, etc.
Video and Multimedia - Editors
Shotcut 18.06 (64-bit) - Windows and Linux - Spanish - License - F T - 184.1 MB - 02/06/18 - Homepage - Download
Shotcut is a video editor that can import, edit and save in numerous video and audio formats.
Video and Multimedia - Players
VLC media player 3.0.3 (64-bit) - Windows and Linux - Spanish - License - F T - 39.5 MB - 31/05/18 - Homepage - Download
VideoLAN Media Player is a media player that plays numerous audio and video formats (MPEG-1, MPEG-2, MPEG-4, DivX, mp3, ogg, etc.) as well as DVDs, VCDs and various streaming protocols. It can also be used as a unicast or multicast server over IPv4 or IPv6 on broadband networks.
Windows - Drivers
Snappy Driver Installer 1.18.06 (32-bit and 64-bit) - Windows - Spanish - License - F T - 4.0 MB - 30/05/18 - Homepage - Download
Snappy Driver Installer detects, downloads and installs updated drivers for all the hardware in our computer. The program does not need installation: unpack the zip file into a folder and run SDI_R1806.exe or SDI_x64_R1806.exe (on 64-bit Windows).
Author: Bartolomé Sintes Marco - cdlibre.org Last updated: 3 June 2018
submitted by rastavallenato to u/rastavallenato

How I achieved under 1ms DPC and under 0.25ms ISR execution times

I set up a VM that passes through my GeForce GTX 1070 and my USB 3 controller(s) for VR. Unfortunately, I am having some trouble with the passed-through USB controllers (which I did not have in the past), but after much tweaking, I have achieved under 1ms DPC latencies and under 0.25ms ISR. I believe that the latency numbers that I am getting are unusually good, so I thought that I would share what I was doing.
Latencies were measured using latencymon. The test was done over RDP by running a speed test in Google Chrome on my gigabit internet connection.
I ran latencymon for much longer than the duration of the speedtest and kept it running while I typed this inside the VM over RDP. The initial "Highest measured interrupt to process latency" was actually under 2000 µs, but it spiked to 11168 µs while I was typing this. Here is what latencymon says:
Your system appears to be having trouble handling real-time audio and other tasks. You are likely to experience buffer underruns appearing as drop outs, clicks or pops. One problem may be related to power management, disable CPU throttling settings in Control Panel and BIOS setup. Check for BIOS updates.
LatencyMon has been analyzing your system for 1:03:45 (h:mm:ss) on processors 0 and 1.
Computer name: GAMESTATION
OS version: Windows 7 Service Pack 1, 6.1, build: 7601 (x64)
Hardware: Standard PC (Q35 + ICH9, 2009), QEMU
CPU: GenuineIntel Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
Logical processors: 4
Processor groups: 1
RAM: 8191 MB total
Reported CPU speed: 350 MHz
Measured CPU speed: 1 MHz (approx.)
Note: reported execution times may be calculated based on a fixed reported CPU speed. Disable variable speed settings like Intel Speed Step and AMD Cool N Quiet in the BIOS setup for more accurate results.
WARNING: the CPU speed that was measured is only a fraction of the CPU speed reported. Your CPUs may be throttled back due to variable speed settings and thermal issues. It is suggested that you run a utility which reports your actual CPU frequency and temperature.
The interrupt to process latency reflects the measured interval that a usermode process needed to respond to a hardware request from the moment the interrupt service routine started execution. This includes the scheduling and execution of a DPC routine, the signaling of an event and the waking up of a usermode thread from an idle wait state in response to that event.
Highest measured interrupt to process latency (µs): 11168.446455
Average measured interrupt to process latency (µs): 29.077679
Highest measured interrupt to DPC latency (µs): 662.026726
Average measured interrupt to DPC latency (µs): 12.165460
Interrupt service routines are routines installed by the OS and device drivers that execute in response to a hardware interrupt signal.
Highest ISR routine execution time (µs): 246.172571
Driver with highest ISR routine execution time: hal.dll - Hardware Abstraction Layer DLL, Microsoft Corporation
Highest reported total ISR routine time (%): 0.251875
Driver with highest ISR total time: hal.dll - Hardware Abstraction Layer DLL, Microsoft Corporation
Total time spent in ISRs (%) 0.252510
ISR count (execution time <250 µs): 3817065
ISR count (execution time 250-500 µs): 0
ISR count (execution time 500-999 µs): 0
ISR count (execution time 1000-1999 µs): 0
ISR count (execution time 2000-3999 µs): 0
ISR count (execution time >=4000 µs): 0
DPC routines are part of the interrupt servicing dispatch mechanism and disable the possibility for a process to utilize the CPU while it is interrupted until the DPC has finished execution.
Highest DPC routine execution time (µs): 964.461714
Driver with highest DPC routine execution time: ndis.sys - NDIS 6.20 driver, Microsoft Corporation
Highest reported total DPC routine time (%): 1.250334
Driver with highest DPC total execution time: rspLLL64.sys - Resplendence Latency Monitoring and Auxiliary Kernel Library, Resplendence Software Projects Sp.
Total time spent in DPCs (%) 1.378208
DPC count (execution time <250 µs): 20343661
DPC count (execution time 250-500 µs): 0
DPC count (execution time 500-999 µs): 1804
DPC count (execution time 1000-1999 µs): 0
DPC count (execution time 2000-3999 µs): 0
DPC count (execution time >=4000 µs): 0
Hard pagefaults are events that get triggered by making use of virtual memory that is not resident in RAM but backed by a memory mapped file on disk. The process of resolving the hard pagefault requires reading in the memory from disk while the process is interrupted and blocked from execution.
NOTE: some processes were hit by hard pagefaults. If these were programs producing audio, they are likely to interrupt the audio stream resulting in dropouts, clicks and pops. Check the Processes tab to see which programs were hit.
Process with highest pagefault count: chrome.exe
Total number of hard pagefaults 1719
Hard pagefault count of hardest hit process: 1163
Highest hard pagefault resolution time (µs): 192286.9160
Total time spent in hard pagefaults (%): 0.021663
Number of processes hit: 15
CPU 0 Interrupt cycle time (s): 288.031983
CPU 0 ISR highest execution time (µs): 246.172571
CPU 0 ISR total execution time (s): 38.548771
CPU 0 ISR count: 3815434
CPU 0 DPC highest execution time (µs): 722.885143
CPU 0 DPC total execution time (s): 204.434165
CPU 0 DPC count: 20173173
CPU 1 Interrupt cycle time (s): 96.247922
CPU 1 ISR highest execution time (µs): 209.685143
CPU 1 ISR total execution time (s): 0.094072
CPU 1 ISR count: 1631
CPU 1 DPC highest execution time (µs): 964.461714
CPU 1 DPC total execution time (s): 6.480125
CPU 1 DPC count: 172292
First off, my system is as follows:
Intel Xeon E5-1650v2
Supermicro X9SRL Motherboard
MSI GeForce 1070 GTX
1.2TB Intel 750 AIC SSD
Dual Supermicro PSUs
Yes, it is possible to game on server grade hardware and no, there is no performance penalty. I realize the irony of not having redundant storage, but those SSDs are expensive, so I decided to do without.
The host is a custom Gentoo Hardened system using musl libc. The kernel is 4.7.10-hardened while QEMU is version 2.11. It was an experimental system that I had working semi-decently when running SteamVR last year, but I left it alone for almost a year and recently upgraded it from an E5-1260 to an E5-1650v2 for APICv. Now here is what I did. First off, the host kernel is compiled with the following key things:
Those are the main things for performance that stand out in my mind, but it has been so long since I compiled this kernel that I have forgotten everything that I did. I suspect that the others are relatively minor though.
I am starting the host with this kernel commandline:
BOOT_IMAGE=(hd0,gpt1)/kernel-genkernel-x86_64-4.7.10-hardened root=ZFS=rpool/ROOT/gentoo ro root=ZFS=rpool/ROOT/gentoo ro doload=nvme noload=xhci_hcd scandelay=1 vfio_pci.ids=10de:10f0,10de:1b81,1b21:1242 rcu_nocbs=1-5,7-11 nohz_full=1-5,7-11 isolcpus=1-5,7-11
Sadly, the USB3 controller listed there (which is actually an add-on board rather than the on-board ports) won't attach to vfio automatically, so I need to do that manually after the system boots. The noload=xhci_hcd was an attempt at forcing that. CONFIG_KVM_VFIO=y was set when I built the kernel to get it to fairly easily grab the GPU and its audio device. What is important here performance-wise is rcu_nocbs, nohz_full and isolcpus. I have a 6-core CPU, and the first core's two threads are the only thing that the host is allowed to use. The other cores are set aside for the VM.
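As a rough illustration of how that CPU list carves up the machine, the sketch below expands 1-5,7-11 and prints what is left for the host. It assumes 12 logical CPUs (6 cores with hyperthreading) and that CPUs 0 and 6 are the two threads of the first core, which is the usual Intel numbering but is an assumption here. The function names are made up for this example.

```shell
# Expand a kernel-style CPU list such as "1-5,7-11" into one CPU number per line.
expand_cpu_list() (
    IFS=,
    for part in $1; do
        case $part in
            *-*) seq "${part%-*}" "${part#*-}" ;;
            *)   echo "$part" ;;
        esac
    done
)

# Print the CPUs in 0..(total-1) that are NOT in the isolated list,
# i.e. the ones left to the host scheduler.
# $1 = total number of logical CPUs, $2 = isolated CPU list.
host_cpus() (
    total=$1
    isolated=$(expand_cpu_list "$2")
    cpu=0
    while [ "$cpu" -lt "$total" ]; do
        echo "$isolated" | grep -qx "$cpu" || echo "$cpu"
        cpu=$((cpu + 1))
    done
)
```

Running host_cpus 12 1-5,7-11 prints 0 and 6, matching the "first core's two threads" description. On a running host, /sys/devices/system/cpu/isolated should show the kernel's own view of the isolated set.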
I have the vhost_net module set to load at system boot, which is important for networking performance. It is supposed to keep the host from having to context switch to QEMU to process network IOs. I also have a few module parameters manually set:

cat /etc/modprobe.d/kvm.conf

options kvm-intel enable_apicv=Y enable_shadow_vmcs=Y nested=Y
nested=Y is not for performance. It was more of a "because I can" thing in case I ever needed/wanted to try running a VM inside my VM.
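To confirm that the module actually picked those options up, something like the following sketch could be used. It only reads the standard sysfs parameter files of kvm_intel; the function name and the optional directory argument (handy for trying it against a copy) are inventions for this example.

```shell
# Print the current value of the three kvm_intel parameters set in kvm.conf.
# $1 = parameter directory (defaults to the live sysfs location).
kvm_intel_params() {
    dir=${1:-/sys/module/kvm_intel/parameters}
    for name in enable_apicv enable_shadow_vmcs nested; do
        # Skip parameters that this kernel build does not expose.
        [ -r "$dir/$name" ] && echo "$name=$(cat "$dir/$name")"
    done
    return 0
}
```

Calling kvm_intel_params with no argument after boot should list each parameter with Y if the modprobe.d file was honored.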
I also have this line in fstab:
hugetlbfs /mnt/hugepages hugetlbfs pagesize=1G 0 0
I have the networking configured to use a software bridge at the moment. I have a local script for OpenRC that is executed at some late point during system boot:

cat /etc/local.d/guest.start

qtap-manipulate create_specific guest
echo 8 | tee > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
This will create a tap device on the bridge that I have set up. This is not as secure as I would like (because in theory the guest could attack the host if the host has exposed services), but I have sufficient security in place that it is not an issue. That qtap-manipulate script lives in /usr/local/sbin and is from here:
After creating the tap device, the local script tells the system to allocate 8GB of memory for the hugetlbfs that QEMU will use for the guest.
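The 8 written to nr_hugepages follows from the guest size: 8192 MiB of guest memory divided into 1 GiB (1048576 kB) pages. A small sketch of that arithmetic, with a made-up function name, in case the guest size or page size changes:

```shell
# Number of huge pages needed to back a guest, rounding up.
# $1 = guest memory in MiB (the value given to QEMU's -m)
# $2 = huge page size in KiB (1048576 for 1G pages, 2048 for 2M pages)
hugepages_needed() {
    mem_kb=$(( $1 * 1024 ))
    echo $(( (mem_kb + $2 - 1) / $2 ))
}
```

hugepages_needed 8192 1048576 gives 8. After the local script has run, /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages shows how many of those pages are still unclaimed.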
I am starting the VM this way:
taskset --cpu-list 5,11 nice -n -20 qemu-system-x86_64 -M q35 -enable-kvm -m 8192 -smp cores=4,sockets=1 -cpu host,kvm=off,hv_time,hv_relaxed,hv_vapic,hv_vpindex,hv_reset,hv_runtime,hv_crash,hv_synic,hv_stimer,hv_spinlocks=0x1fff,hv_vendor_id=Gentoo,-hypervisor -display none -nographic -device vfio-pci,host=04:00.0,multifunction=on,x-vga=on -device vfio-pci,host=04:00.1 -device vfio-pci,host=07:00.0 -device vfio-pci,host=08:00.0 -drive file=/dev/zvol/rpool/win7,if=none,id=hd,format=raw,aio=threads,cache=none,cache.direct=on -device virtio-scsi-pci,id=scsi -device scsi-hd,drive=hd -net nic,model=virtio,macaddr=52:54:00:12:34:56 -net tap,ifname=guest,script=no,downscript=no,vhost=on -rtc clock=host,base=localtime -mem-path /mnt/hugepages -mem-prealloc -serial none -parallel none -balloon none -nodefaults -monitor stdio
Then I immediately run info cpus in the QEMU console to get the first vCPU thread id (with the others usually being in sequential order afterward). If, for example, the first vCPU PID is 5685, I run:
(export PID=5685; chrt -a -r --pid 1 $PID; for i in {1..4}; do taskset -pc $i $((PID + i - 1)); chrt -f --pid 1 $((PID + i - 1)); done;)
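Since the vCPU thread IDs are only usually sequential, a variant that takes every thread ID explicitly may be safer. The sketch below only prints the taskset and chrt commands so they can be reviewed first. The function name is made up, and the thread IDs in the usage note are just the example value from above continued.

```shell
# Print (not run) the pinning commands for a list of vCPU thread IDs.
# $1 = first host CPU to use; remaining arguments = vCPU thread IDs in vCPU order.
# Each vCPU thread gets its own CPU and SCHED_FIFO priority 1.
vcpu_pin_commands() {
    cpu=$1
    shift
    for tid in "$@"; do
        echo "taskset -pc $cpu $tid"
        echo "chrt -f --pid 1 $tid"
        cpu=$((cpu + 1))
    done
}
```

For example, vcpu_pin_commands 1 5685 5686 5687 5688 prints eight commands; piping that output to sh as root would apply them.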
Then I run:
for i in $(cat /proc/interrupts | grep vfio | sed -e 's/ \([0-9]*\):\(.*\)/\1/'); do echo 1-4 > /proc/irq/$i/smp_affinity_list; done;
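The IRQ extraction in that loop can also be written with awk, which avoids the regex escaping. This is only an equivalent sketch, not what I actually ran; the function name and the optional file argument are made up for the example.

```shell
# Print the IRQ number of every line in a /proc/interrupts-style file whose
# text contains "vfio". $1 = file to read (defaults to /proc/interrupts).
vfio_irqs() {
    awk '/vfio/ { sub(/:$/, "", $1); print $1 }' "${1:-/proc/interrupts}"
}
```

The affinity loop then becomes: for irq in $(vfio_irqs); do echo 1-4 > /proc/irq/$irq/smp_affinity_list; done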
What is happening here is that I am fiddling with the thread scheduler and CPU affinities. I don't think niceness even applies for realtime priority, but it is something that I had done before I started trying to fiddle with realtime stuff and I left it in the command. The exact things that I am doing are:
  1. Making the passed through devices do interrupts on only the cores servicing the VM's vCPUs to avoid unnecessary overhead from having to move the interrupt to another core.
  2. Putting QEMU on the last core's two threads with it being the only thing running on them thanks to isolcpus.
  3. Configuring the 4 vCPUs to use the 4 middle cores and be the only thing running on them thanks to isolcpus. I am not bothering with hyperthreading partly because previous tests on the older CPU showed it hurt latencies and partly because I don't feel like modifying the commands that I run to support it properly.
  4. Passing through the CPU geometry so that the guest knows that all 4 cores are part of the same socket rather than different sockets. This keeps the guest from thinking NUMA applies, which has been a slight performance penalty in my experience with various VMs in the past.
  5. Setting realtime priority on the threads. The regular QEMU threads get round robin while the vCPUs get fifo. This might be overkill on the vCPU threads because I do not know any situation where it would matter, but I am doing it there just in case. It is necessary on the QEMU threads to get them to use both threads of the last physical core. This is due to a Linux kernel scheduler bug: when taskset tells the thread scheduler that it can run a thread on any of multiple CPUs that were isolated via isolcpus and the scheduler class is left at the default SCHED_OTHER, no more than one of those CPUs gets used.
  6. Passing through all of the capabilities of the host CPU, which includes x2apic and should force APICv. Measurements at /proc/interrupts and with latencymon seem to confirm that APICv is working for the overwhelming majority of interrupts.
  7. Setting niceness, which likely does nothing now, but I left it there.
  8. Enabling HyperV enlightenments. Some of this probably is not necessary now that I am doing the realtime stuff (and might be harmful in the case of hv_spinlocks), but I have yet to look into that.
  9. Hiding the hypervisor flag. This plus hv_vendor_id=Gentoo were originally to avoid issues with the Nvidia driver not liking being run in a VM. I have since read that it prevents Windows 7 from disabling use of the TSC, which makes it a performance feature.
  10. Forcing vhost (although I think qemu/kvm will autodetect it and make use of it).
  11. Forcing the storage to use a ZFS zvol directly without caching (for data integrity) and with aio for improved performance. The latest ZoL 0.7.x will probably do better with aio=native, but this system was first configured with ZoL 0.6.9, where aio=threads was likely better.
  12. Using huge pages to reduce overhead from nested virtual memory page tables as much as possible.
  13. Telling QEMU to preallocate memory to avoid page faults on the host as much as possible.
  14. Using Q35 and disabling as many unnecessary QEMU devices as I could.
  15. Using virtio-net and virtio-scsi for networking and storage respectively to help minimize overhead.
  16. Relying on the passed through USB controller for the mouse, keyboard and HTC Vive. This used to work until I made additional changes to the system. I need to figure out how it broke.
  17. Lastly, passing through the GPU, although that one should be obvious.
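One way to check that the pinning from items 2, 3 and 5 took effect is to read each thread's allowed CPU list back from /proc. A minimal sketch, with a made-up function name and an optional proc root argument so it can be tried against a saved copy:

```shell
# Print the Cpus_allowed_list of a thread.
# $1 = thread ID (or "self"), $2 = proc root (defaults to /proc).
thread_allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "${2:-/proc}/$1/status"
}
```

With the example thread ID used earlier, thread_allowed_cpus 5685 should print 1 after pinning, and chrt -p 5685 reports the scheduling policy and priority of the same thread.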
Inside the VM, I installed the latest virtio-net and virtio-scsi drivers. I also limited the network speed to 1Gbps because my network wouldn't go any faster than that and I was concerned about unnecessary bufferbloat needlessly increasing latencies. I also had manually set MSI on everything that I could and tried forcing x2apic via bcdedit for APICv, but I suspect that was unnecessary. I also disabled the Windows Firewall because I did not feel like fiddling with it to ensure RDP would always work. I don't think that is very important for performance. Security-wise it is okay for me to do that because there is nothing else on the VLAN that the guest uses aside from the host and there is a pfSense firewall between the system and the rest of my internal network. In the future, I plan to isolate the host via VLAN trunking, but I just haven't made time to do that. I also did performance tuning for RDP a year ago, but I don't remember the details that well aside from enabling RemoteFX, so I'll just link this:
The realtime trick of setting interrupt affinities in addition to isolcpus is one that I have not seen anyone using vfio describe, so I decided to write this up to share. There are a few things that I could improve slightly for better performance, but I think that I did a decent job of optimization. I definitely could make this easier on myself to start. I intend to do that eventually.
I still need to figure out the USB issue. A few times it works and most of the time it doesn't work. With the ASM1142 add-in card, it seems to work only once per host boot and then the USB controller gets confused. It used to be reliable with the built-in Renesas USB controllers, but it is having issues now. Windows complained about power surges on the USB ports on a previous boot, so I am not sure what is going on there. I have read about bare metal Windows systems having problems with USB detection, with it being a rare issue that Microsoft has not been able to nail down, so it might not be an issue with the virtualization itself. I imagine that the worst case is that I reinstall Windows to fix it. :/
Tests before applying the irq affinity tweaks and a few other minor things (undoing dedup in ZFS that I did as part of a separate experiment last year) showed that it got a perfect score in the SteamVR performance test. Performance in VR when the USB was working (also before applying those just mentioned tweaks) was fairly good, although not quite perfect. I suspect it is better now, but the USB issue will need to be solved first.
In hindsight, I neglected to force latencymon to monitor more than the first two cores. I only found out that I could force it to monitor all of the cores after I had already written this, but I think that the results are significant enough that I am going ahead and posting it anyway.
The main reason that I virtualized Windows was so that I could use ZFS snapshots to rollback if something went wrong with Windows. It has served me rather well so far. I could have had a separate system for storage and booted Windows off iSCSI over an infiniband link to get ZFS snapshots of Windows' storage, but the VM setup is more convenient to manage, uses less physical space and uses less power.
Lastly, my apologies for the wall of text and the DIY nature of this post. I have other things that need my time, so I stopped short of making this into a guide. I expect that most people in this subreddit will be able to figure out how to copy tweaks that I have done. Keep in mind that I did admit that a few of the things such as setting niceness are artifacts of earlier optimization efforts and likely don't matter.
Edit: I forgot to mention that I had configured the host to use the performance CPU governor. This is important, so I edited the post to mention it under the kernel config options. In my case, I decided to compile the choice into the kernel (because it turns out to be the right choice in all situations), but you can use cpufreq-utils to get the same effect. I also added information on the local script that I am running at boot to do some configuration.
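For anyone who does not compile the governor choice into the kernel, the current setting can be inspected through the standard cpufreq sysfs files. The sketch below only reads them; the function name and the optional directory argument are made up for the example.

```shell
# Print "cpuN governor" for each CPU that exposes a cpufreq governor.
# $1 = CPU sysfs directory (defaults to /sys/devices/system/cpu).
show_governors() {
    for f in "${1:-/sys/devices/system/cpu}"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue
        cpu=${f%/cpufreq/scaling_governor}
        echo "${cpu##*/} $(cat "$f")"
    done
}
```

Every line should end in performance. Writing performance into the same scaling_governor files as root changes the governor at runtime.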
Edit: Long story short, I can confirm that aio=native is better for a zvol on ZoL 0.7.6. I am not surprised given that is what I expected from my knowledge as one of the people who contributed to the zvol support in ZFSOnLinux. I also think that implementing asynchronous zero copy in zvols would make them much more performant, but that is a discussion for the ZFSOnLinux issue tracker after I have patches to demonstrate what I am thinking.
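For reference, the only change this implies on the QEMU command line shown earlier is the aio value in the -drive option. The variable name below is just for illustration:

```shell
# Same -drive option as in the original command, with aio=threads changed to
# aio=native. QEMU's native AIO needs O_DIRECT, which cache.direct=on provides.
DRIVE_OPTS='file=/dev/zvol/rpool/win7,if=none,id=hd,format=raw,aio=native,cache=none,cache.direct=on'
echo "-drive $DRIVE_OPTS"
```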
submitted by ryao to VFIO