Is it appropriate to use haveged as a source of entropy on virtual machines?

  • While looking for solutions to entropy pool depletion on virtual machines, I came across an interesting project called haveged, which is based on the HAVEGE algorithm (HArdware Volatile Entropy Gathering and Expansion). It makes a pretty fantastic claim.

    HAVEGE is a random number generator that exploits the modifications of the internal CPU hardware states (caches, branch predictors, TLBs) as a source of uncertainty. During an initialization phase, the hardware clock cycle counter of the processor is used to gather part of this entropy: tens of thousands of unpredictable bits can be gathered per operating system call in average.

    If this really produces nearly unlimited high-quality entropy on headless virtual machines, it should be included in every server distribution by default! And yet, some people have raised concerns.

    "At its heart, [HAVEGE] uses timing information based on the processor's high resolution timer (the RDTSC instruction). This instruction can be virtualized, and some virtual machine hosts have chosen to disable this instruction, returning 0s or predictable results."
    (Source: PolarSSL Security Advisory 2011-02 on polarssl.org).

    And furthermore, popular NIST and ENT tests will sometimes give haveged a PASS even when it's intentionally mis-configured, and not actually producing random numbers!

    I replaced the “HARDTICKS” macro in HAVEGE with the constant 0 (zero) rather than reading the time stamp counter of the processor. This immediately failed the randomness test. However, when I used the constant 1 (one) instead, the ent test passed. And even nist almost passed with only a single missed test out of the 426 tests executed. (Source: Evaluating HAVEGE Randomness on engbloms.se).

    • So, which virtualization platforms/hypervisors are safe to use with haveged in a virtual machine?
    • And is there a generally accepted best practice way to test whether a source of randomness is producing sufficiently high quality numbers?

    haveged is included in most repos, though some versions may be quite old. Version 1.5 introduced AIS31 testing.

  • Tom Leek (Correct answer, 8 years ago)

    (Caveat: I certainly don't claim that HAVEGE lives up to its claims. I have not checked their theory or implementation.)

    To get randomness, HAVEGE and similar systems feed on "physical events", and in particular on the timing of physical events. Such events include hardware interrupts (which, in turn, carry data about key strokes, mouse movements, incoming ethernet packets, the time for a hard disk to complete a write request...). HAVEGE also claims to feed on all the types of cache misses which occur in a CPU (L1 cache, L2 cache, TLB, branch prediction...); the behaviour of these elements depends on what the CPU has been doing over the previous few thousand clock cycles, so there is potential for some "randomness". This hinges on the ability to measure the current time with great precision (not necessarily accuracy), which is where the rdtsc instruction comes into play. It returns the current contents of an internal counter which is incremented at each clock cycle, so it offers sub-nanosecond precision.

    For a virtual machine system, there are three choices with regards to this instruction:

    1. Let the instruction go to the hardware directly.
    2. Trap the instruction and emulate it.
    3. Disable the instruction altogether.

    If the VM manager chooses the first solution, then rdtsc has all the needed precision, and should work as well as on a physical machine for the purpose of gathering entropy from hardware events. However, since this is a virtual machine, it is an application on the host system; it does not get the CPU all the time. From the point of view of the guest operating system using rdtsc, this looks as if its CPU was "stolen" occasionally: two successive rdtsc instructions, nominally separated by a single clock cycle, may report an increase of the counter by several millions. In short, when rdtsc is simply applied on the hardware, the guest OS can use it to detect the presence of a hypervisor.

    The second solution is meant to make the emulation more "perfect" by maintaining a virtual per-VM cycle counter, which keeps track of the cycles actually allocated to that VM. The upside is that rdtsc, from the point of view of the guest, no longer exhibits the "stolen cycles" effect. The downside is that this emulation is performed by trapping the instruction through a CPU exception, raising the cost of the rdtsc opcode from a few dozen clock cycles (it depends on the CPU brand; some execute rdtsc in fewer than 10 cycles, others take 60 or 70) to more than a thousand cycles. If the guest executes a lot of rdtsc instructions (as HAVEGE is prone to do), it will slow down to a crawl. Moreover, the exception handling code will disrupt the measurement; instead of capturing the timing of hardware events, the code will measure the execution time of the exception handler, which can conceivably lower the quality of the extracted randomness.

    The third solution (disabling rdtsc) will simply prevent HAVEGE from returning good randomness. Since it internally uses a PRNG, the output may still fool statistical analysis tools, because there is a huge difference between "looking random" and "being unpredictable" (statistical analysis tools follow the "look random" path, but cryptographic security relies on unpredictability).

    The VirtualBox manual claims that VirtualBox, by default, follows the first method (rdtsc is unconditionally allowed and applied on the hardware directly), but may be configured to apply the second solution (which you don't want, in this case).

    To test what your VM does, you can try this small program (compile with gcc -W -Wall -O on Linux; the -O is important):

    #include <stdio.h>
    
    #if defined(__i386__)
    
    /* On 32-bit x86, emit the RDTSC opcode bytes directly; the "=A"
     * constraint returns the 64-bit result in EDX:EAX. */
    static __inline__ unsigned long long rdtsc(void)
    {
            unsigned long long int x;
    
            __asm__ __volatile__ (".byte 0x0f, 0x31" : "=A" (x));
            return x;
    }
    
    #elif defined(__x86_64__)
    
    /* On x86-64, RDTSC still writes EDX:EAX; recombine the two
     * 32-bit halves into one 64-bit value. */
    static __inline__ unsigned long long rdtsc(void)
    {
            unsigned hi, lo;
    
            __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
            return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
    }
    
    #endif
    
    int
    main(void)
    {
            long i;
            unsigned long long d;
    
            /* Average the counter delta between two back-to-back reads:
             * this measures the cost of rdtsc itself. */
            d = 0;
            for (i = 0; i < 1000000; i ++) {
                    unsigned long long b, e;
    
                    b = rdtsc();
                    e = rdtsc();
                    d += e - b;
            }
            printf("average : %.3f\n", (double)d / 1000000.0);
            return 0;
    }
    

    On a non-virtual machine, with the "true" rdtsc, this should report a value between 10 and 100, depending on the CPU brand. If the reported value is 0, or if the program crashes, then rdtsc is disabled. If the value is in the thousands, then rdtsc is emulated, which means that the entropy gathering may not work as well as expected.

    Note that even getting a value between 10 and 100 is not a guarantee that rdtsc is not emulated, because the VM manager, while maintaining its virtual counter, may subtract from it the expected time needed for execution of the exception handler. Ultimately, you really need to have a good look at the manual and configuration of your VM manager.


    Of course, the whole premise of HAVEGE is questionable. For any practical security, you need a few "really random" bits, no more than 200 or so, which you use as the seed of a cryptographically secure PRNG. The PRNG will then produce gigabytes of pseudo-random output indistinguishable from true randomness, and that's good enough for all practical purposes.

    Insisting on going back to the hardware for every bit looks like yet another outbreak of that flawed idea which sees entropy as a kind of gasoline, which you burn up when you look at it.

    Recent versions of haveged implement run-time testing of the output using an adaptation of AIS31, the test suite used by the German Common Criteria body to certify smart cards; see http://www.issihosts.com/haveged/ais31.html. With a constant value in place of the high resolution timer, haveged may behave like a long-sequence PRNG (or not, depending on the details of the bit pattern). If the output passes the continuous tests, how does that matter? If anything, that argues that the particulars of the timer source are not as critical as one might think.

    @gwuertz: if the output passes the continuous tests even when its entropy source has been fixed (no measurement at all!), then this means that the tests are not adequate to test for cryptographic strength -- as is expected, since such tests look for statistical patterns, not for unpredictability. If you take a strong crypto-grade PRNG and seed it with a fixed 0, you will also pass all such tests with success, and your security will nonetheless be nil.

    I may also add that I have some reservations about trusting people to implement a PRNG when they show that their framework for testing their results is statistical tests, which, by definition, are terrible for cryptographic usage. This does not bode well.

    @TomLeek is there an alternative? I always thought there is no proof for unpredictability itself.

    Cryptographers use detailed and thorough peer review to get some assurance that their design is not obviously bad. Now, that's not really great. My point, though, is that statistical tests are terrible at evaluating security. Such tests can only detect extremely poor and weak PRNGs. It is easy to make a PRNG which will pass all such tests (hey, even an LFSR can do it). From a cryptographer's point of view, it is not even worth reporting. To make an analogy: it looks like a car manufacturer who publishes big ads stating that his cars are great because he has ve-ri-fied that they _have wheels_.

    The instruction may run with sub-nanosecond precision on its own, but remember that the process HAVEGE is running in will only execute during specific, predictable and deterministic timeslices.

    Regarding the premise being questionable, remember that that's just the `haveged` program which has odd behavior, not the library itself. The library is actually pretty popular in some embedded devices, e.g. mbedtls uses libhavege if a system entropy source is not present, and it only generates the amount of random data that is requested from it (e.g. for key operations).

License under CC-BY-SA with attribution


Content dated before 7/24/2021 11:53 AM