On my Linux boxes I have set up a QEMU/KVM virtual machine and given it all but one of the cores,
so the host keeps one full CPU core free to feed the GPU without interference.
With this setup, Moo! Wrapper runs about twice as fast as it does when all cores are busy with CPU work and the GPU is 'starving'.
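
For anyone wanting to try the same split, here is a rough sketch of the arithmetic in Python. It assumes the host keeps core 0 and the VM gets the rest, and it prints libvirt-style `<vcpupin>` lines in case you also want to pin the vCPUs. The pinning part and the function names `split_cores` and `vcpupin_lines` are only illustrations and not necessarily how my own VM is configured.

```python
import os


def split_cores(total_cores, reserved_for_host=1):
    """Split core indices into (host_cores, vm_cores).

    Assumption: the host keeps the lowest-numbered cores.
    """
    if total_cores <= reserved_for_host:
        raise ValueError("not enough cores to leave any for the VM")
    host_cores = list(range(reserved_for_host))
    vm_cores = list(range(reserved_for_host, total_cores))
    return host_cores, vm_cores


def vcpupin_lines(vm_cores):
    """Build one libvirt <vcpupin> element per vCPU (goes inside <cputune>)."""
    return [
        f'<vcpupin vcpu="{vcpu}" cpuset="{core}"/>'
        for vcpu, core in enumerate(vm_cores)
    ]


if __name__ == "__main__":
    total = os.cpu_count() or 1
    if total > 1:
        host, vm = split_cores(total)
        print(f"host keeps cores {host}, VM gets {len(vm)} vCPUs")
        for line in vcpupin_lines(vm):
            print(line)
```

The number of vCPUs to give the VM is simply the length of the second list; the pin lines are optional.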

I have completed one work unit on this project and will get back to it later, but for the moment all 22 of my cores are busy with priority work.