An In Depth Look into Random Number Generation in .NET



  ·  
1 July 2021  
  ·  
 35 min read

In this article we’ll explore thoroughly how pseudo-random number generators (PRNG) work, both in general and in the specific context to the various implementations found in the .NET Base Class Libraries (BCL). We’ll also look at the background behind some of the popular PRNG algorithms.

Working with the .NET System.Random class is, for a majority of usage scenarios, the appropriate “go-to” choice that developers should use. There are very minor nuances to be mindful of, but working with this class is mostly straightforward.

For those development scenarios that are specifically related to cryptography and security, one should use the System.Security.Cryptography.RandomNumberGenerator class to generate random values. This article will not be exploring that class beyond basic usage and implementation.

With the introduction of .NET Core 2.0, the internal implementation of the System.Random class was revised slightly compared to the legacy .NET Framework. From a usage perspective, things remained virtually unchanged - but there are some minor nuances that may help to be aware of.

Most recently, with the upcoming .NET 6 (still in preview at the time of writing), the implementation of the Random class has been further revised. Part of the revision includes a change to the underlying default PRNG algorithm that is now used in the Random class, along with changes to the cryptographically strong RNG libraries.

For a vast majority of developers, it’s entirely fair and reasonable to dismiss these revisions to the BCL - along with entire sections of this very article - as being “just about implementation details that I don’t need to know about”. A majority of this blog is intended for a different audience - those who are interested in how stuff works under the hood. This one is meant to be a complete geek-out. ;-)

Note: There has been a great deal of information covered in this article. I want to be clear that I am not an authoritative expert, so I’ve tried hard to validate everything stated here with links and references. As always, if you’re reading something here that you know is glaringly wrong, drop me a line and I’ll correct it. It helps nobody to propagate misinformation! Find me on Twitter @siliconorchid



Contents

This is a long article that covers the following:



Basic usage of the various .NET Random Number Generator classes

This section is intended for the benefit of readers who are completely new to C#, or those in a hurry and just want to “cut to the chase” with quick code samples.



Basic usage of the .NET System.Random class

The following simple console app example uses System.Random to display a random integer, in the range of 0 to 99:

using System;

namespace BasicRandomNumberConsoleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            var randomGenerator = new Random();
            int randomInt = randomGenerator.Next(100);
            Console.WriteLine($"{randomInt}");
        }
    }
}

In the above example, the .Next method returns integers. If we would like to return other types, we could use NextBytes() or NextDouble().

Note: specifying .Next(100) will never return a value 100!



Basic usage of the .NET Core (onwards) System.Security.Cryptography.RandomNumberGenerator class

using System;
using System.Security.Cryptography;

namespace BasicCryptoRandomNumberConsoleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            int randomInt = RandomNumberGenerator.GetInt32(0, 100);
            Console.WriteLine($"{randomInt}");
        }
    }
}

Note: Don’t choose this class over System.Random unless you appreciate why and how it should be used!

Note: Cryptographic libraries were not originally supplied with the original .NET Core 1.x, only becoming available with .NET Core 2.x onwards.

Note: .NET Core 3.x introduced a revision that included a class for creating cryptographically strong random numbers, that removed bias that was present in .NET Core 2.x



Basic usage of the legacy .NET System.Security.Cryptography.RNGCryptoServiceProvider class

The following console app generates a cryptographically strong random integer and displays it in the console:

using System;
using System.Security.Cryptography;

namespace SuperBasicRNGCryptoRandomNumberConsoleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            RNGCryptoServiceProvider rngCryptoServiceProvider = new RNGCryptoServiceProvider();
            byte[] integerBytes = new byte[4];
            int randomInt;

            rngCryptoServiceProvider.GetBytes(integerBytes);
            randomInt = BitConverter.ToInt32(integerBytes, 0);

            Console.WriteLine($"{randomInt}");
        }
    }
}

Note: Don’t choose this class over System.Random unless you appreciate why and how it should be used!

Note: This class is for use with legacy .NET; for .NET Core onwards use the RandomNumberGenerator class. Also, be aware that this class is about to become deprecated in .NET 6.x. For information, check out the section at the end of this article.



image of rouletta hole

How do computers generate random numbers?

TLDR; We can exploit system hardware to generate random numbers that are good enough for use with cryptography/security. However, doing it this way is slow. Unless we need to, we use an algorithm to generate numbers that are “good enough for most uses” as it’s much more efficient.



What’s the issue with computers generating random numbers?

If we search online into the topic of “computers generating random numbers”, it won’t take very long to stumble across a heated online discussion along the following lines:

  • “ahah! - there is no such thing as truly random!”
  • ”… consider a gaming dice - it may be imperfectly manufactured or subject to how it is thrown”.
  • “the only truly random number generation comes from quantum physics”

All of this is literally correct - but for practical purposes, we really don’t need to get distracted by these discussions.

What is appropriate for us, is to just acknowledge that we realise that our computers aren’t perfect at generating random numbers - but they are acceptably good enough for a vast majority of applications (and anything we are likely to create using .NET).



Can computer hardware be used to generate random numbers?

Primary reference material:

Today, the way that our mainstream contemporary computers generate sequences of numbers, that are considered random enough for use in security and cryptography, is to utilise various unique aspects of the operating system and system hardware.

To be effective at generating random numbers, the operating system needs to introduce as much entropy as possible.

According to a whitepaper by Microsoft, The Windows 10 random number generation infrastructure, Windows uses the following items for entropy (the following list is an extraction from that document - refer to the whitepaper if you are interested in fully detailed descriptions):

  • Interrupt timings
  • Startup (state of interrupts at startup)
  • Analysis (of raw Timestamp data)
  • TPM
  • RDRAND/RDSEED (intel processor instructions)
  • Seed File (a registry entry created during startup)
  • External entropy (another registry entry, created during OS setup process)
  • ACPI-OEM0 (related to hypervisors and power management)
  • Firmware data (enumerates and creates hashes from various firmware tables)
  • UEFI Protocol (call to UEFI for entropy from chipset)
  • Time at startup
  • Virtual machine rewind (related to the risk of a hypervisor replaying a machine image)

This functionality is exposed via an API provided by the operating system. In .NET we use something called an “Interop” which provides a wrapper for accessing these APIs on different host operating systems.

The problem with making calls to hardware is that relatively speaking, it’s really slow.

Generating the occasional number may not be an concern, but frequently generating sequences of random numbers is likely to cause performance problems.

It’s because of this that we need to adopt a different approach to generate sequences of random numbers.

Note: When I’m referring to “OS APIs”, I’m trying to use language that I think is cross-platform agnostic. In this example, if we were running on Windows, the wrapper would ultimately be using a Windows RPC.

Note: Regarding “RDRAND” : The MS whitepaper identifies the instruction as “Intel RDRAND”. This may possibly leave us with the impression that this is only available on Intel CPUs. That isn’t the case - according to wikipedia and this technical doc AMD Random Number Generator it seems to be identified as an x86 instruction which is also supported on recent AMD processors.

Note: According to a comment near the bottom of a 2019 blog by Microsoft’s Stephen Toub and Shawn Farkas Tales from the CryptoRandom, based on non-scientific tests, they reckoned that using System.Random was over 100 times faster than System.Security.Cryptography.RNGCryptoServiceProvider

Trivia: In 1957, the UK government used a system called ERNIE (Electronic Random Number Indicator Equipment) which used random noise to generate numbers. Its practical application was to create Premium Bond numbers. The original system used thermal noise - a modern revision uses technology that reads quantum noise.

Blog Update 6 July: As a quick post-publish update, I've just found an excellent article by Khalid Abuhakmeh : Creating Random Numbers With .NET Core where he provides code to test the difference in speed between `Random ` and `RandomNumberGenerator ` classes. In his tests he shows a difference of 0.4052 ms compared to 90.7277 ms.



What is a deterministic Pseudo-Random Number Generator (PRNG)?

Instead of placing multiple calls to the hardware, we can greatly improving performance by using algorithms to procedurally generate sequences of numbers that are “good enough for most uses”.

These algorithms are called Pseudo-random number generators (PRNG).

As a regular software developer, it’s entirely reasonable for us to consider that the random number sequences generated by these algorithms have “effective randomness of good quality”. We’ll explore what that actually means in the next section.

Something to consider with any PRNG algorithm is that, at the end of the day, they are still “just an algorithm”. To put it simply, “if we feed a known value into an algorithm input, we can always expect the same result(s) as output”.

The consequence of this is that a sequence of numbers produced by a PRNG algorithm will always be the same. We call this a “deterministic result”.

As developers, if we’re working on part of an application involving any form of cryptography or security, being able to “guess the next number” is a huge problem.

  • If we’re not working on part of the application where cryptography/security is relevant, stick to using a fast algorithmic PRNG.

    • In the context of .NET specifically - this is exactly what the System.Random class provides for us.
  • For those scenarios where cryptography/security is applicable, we instead accept the performance hit and use an appropriate API to generate each number in a sequence, by placing calls to the OS/hardware.

    • In the context of .NET specifically - this is what the System.Security.Cryptography.RNGCryptoServiceProvider class provides for us.



What makes a good PRNG?

Different PRNG algorithms have been devised over a period of several decades by mathematicians and computer scientists.

For example, the algorithm that underpins the implementation of the legacy .NET Framework version of the System.Random class has roots in computer science dating back to the 1960s.

Because of this, there is variety in the “quality” and efficiency. Newer algorithms are finding ways to make improvements.

Disregarding efficiency for a moment, how do we qualify “a PRNG that produces good quality numbers”?

Informed by the article Cryptomathid : Generating Cryptographic Keys: Will Your Random Number Generators (PRNGs) Do The Job? , properties of a good PRNG include:

  • Long period - we don’t want to see patterns of repeating numbers.
  • Uniformity - the spread of numbers within the range should be uniform - we don’t want to see a predominant cluster of numbers.
  • Uncorrelated Sequences - by analysing the numbers presented in a single overall sequence, we shouldn’t be able to correlate a sequence of numbers within one section [of that overall sequence] with numbers within a section found elsewhere [of that overall sequence]
  • Computational Indistinguishability - when presented with two or more overall sequences of numbers, if we were to take a subset from one of those sequences, we should not be able to identify from which sequence that subset originated.



image of seedling

What is a PRNG Seed Number?

A random number generator that literally produced the same sequence of numbers, every time, for each and everyone who used it, would be the worst invention in computing history - ever! ;-)

So, whilst using the word “obviously” isn’t a terribly inclusive term, I’m sure you’ll agree that on this occasion it’s fair to say “that’s obviously not how a PRNG works!”

PRNG algorithms rely on something called a “seed number” to begin the sequence.

From that starting “seed” number, each iteration of the algorithm loop creates a new number that is derived from the preceding number in the sequence.

Therefore, if we know the “seed number”, we can always know what the next numbers in the sequence will be.



How are PRNG Seeded in .NET?

In .NET, we have the option to manually specify a seed number in the class constructor.

If this initially seems like an odd thing to do, there will be scenarios such as unit testing where knowing the numbers in a sequence will be desirable.

However, the default option is to not specify a seed value, meaning that the Random class will go ahead and automatically create a seed value for us.

This is where things get a bit interesting; If you’ve ever been advised by a seasoned .NET developer that this seed generation is “just a derivative of the system timer” … that advice may not be applicable or current:

  • Historically, for the legacy .NET Framework, that advice is good - the seeding is indeed driven primarily by just the system timer.
  • However, since .NET Core 2.x, the default seed values are generated by the OS - those same hardware-generated, cryptographically-strong, random numbers that we talked about earlier.

Reading this, some readers may possibly raise this question:

  • Q: “So, if the seed value in recent .NET versions is cryptographically-strong - does this make System.Random safe for use in cryptography/security?“.
  • A: “No! Do not use System.Random for cryptography/security purposes. Although the seeding value itself may be a “cryptographically-strong” random value, the resultant sequence of numbers produced by an algorithm, will still be deterministic, making it unsuitable.

In the next section, we’ll expand on this further by digging into the actual implementations of PRNG seeding in the various .NET versions.

Suggestion: If you’re referring to the official MS documentation of the Random class, I recommend paying particularly close attention to where the document authors are explicitly referring to the legacy .NET Framework. When I first started looking at the docs, I misread them - and ended up incorrectly assuming that some of the advice was applicable to all versions of .NET.



Random number seed generation for the legacy .NET Framework

According to the MS docs at the time of writing “In .NET Framework, the default seed value is time-dependent.”

As we mentioned in the previous section, behind the scenes, the implementation uses the system timer to generate a seed value, based upon “ticks from startup”.

Because the .NET Framework Timer has finite granularity - reported to have a resolution of ~15ms - this creates a potential risk of spawning multiple Random class instances with the exact same sequence of numbers.

The MS official docs clarifies that “On the .NET Framework, initializing two random number generators in a tight loop or in rapid succession creates two random number generators that can produce identical sequences of random numbers.”

If we look at the C++ source for that legacy code, we can see that the seed is derived from the system timer’s Environment.GetTickCount:

void Init()
{
    LIMITED_METHOD_CONTRACT;
    LARGE_INTEGER time;
    if (!QueryPerformanceCounter(&time))
        time.QuadPart = GetTickCount();
    Init((int)time.u.LowPart ^ GetCurrentThreadId() ^ GetCurrentProcessId());
}

We can also see that there are uses of “CurrentThread” and “CurrentProcess” ID values, but other than these, there’s not much else going on to create the “entropy” needed for creating strong random numbers.



Random number seed generation for .NET Core 2.x to .NET 5

According to the docs, “In .NET Core, the default seed value is produced by the thread-static, pseudo-random number generator.”

Comment: I personally found this confusing (but that’s probably just me) - it seems to describe a meta-situation of a PRNG being used to seed another PRNG. It prompted the question - “so, what actually generates that original seed?”

Looking at the historic entry for .NET Core source for Random from May 2018, we can see that there is a method called GenerateGlobalSeed() that looks like the following;

private static unsafe int GenerateGlobalSeed()
{
    int result;
    Interop.GetRandomBytes((byte*)&result, sizeof(int));
    return result;
}

Specifically, in this code we can see the line Interop.GetRandomBytes(...). This is the call to the host OS, which in turn generates a cryptographically strong random array of bytes that can then be converted into an integer.



Random number seed generation for .NET 6.x

Reading through the preview source code for .NET 6.x, as available at the time of writing, we can see that there is an interesting reimplementation of the ‘System.Random’ class.

The changes are partially structural, meaning that the class can now accommodate different implementations of different PRNG algorithms.

.NET 6.x introduces an implementation of a much newer PRNG algorithm called “Xoshiro”. We’ll dig into that later in this article, but for now let’s just look at part of the source code:

public unsafe XoshiroImpl()
{
    uint* ptr = stackalloc uint[4];
    do
    {
        Interop.GetRandomBytes((byte*)ptr, 4 * sizeof(uint));
        _s0 = ptr[0];
        _s1 = ptr[1];
        _s2 = ptr[2];
        _s3 = ptr[3];
    }
    while ((_s0 | _s1 | _s2 | _s3) == 0); // at least one value must be non-zero
}

Again, we can see that just as with the .NET Core 2.x implementation, there is that same Interop.GetRandomBytes(...): .NET interop call to the host OS used to generate the seed. So, in this respect, .NET Core and .NET 6 are the same.



image of single tulip

What about GUIDs?

GUIDs (aka UUID - GUID is more of a Microsoft naming convention) are another staple of randomness found in software.

According to wikipedia, UUIDs have been around since the 1980s. So if they’re that old, surely new UUIDs can’t have been generated to cryptographically strong standards right?

Historically that may have been an issue, but in Windows - at least since the release of Windows 2000 (Dec 1999) - GUIDs are generated with cryptographic randomness.

It’s useful to know that there are different RFC standards applicable to UUIDs. The latest, version 4, specifically defines how a UUID should be generated.

In RFC 4122 section 4.4 it is stated that “The version 4 UUID is meant for generating UUIDs from truly-random or pseudo-random numbers”

Specifically to Window OS, the Microsoft Whitepaper MS-SECO -Windows Security Overview states in section 2.5.5.2 that:

“For reasons of increased privacy protection for our customers, Microsoft systems beginning with Windows 2000 prefer to generate version 4 GUIDs in which the 122 bits of nonformat information are random. Although only a small minority of version 4 GUIDs require cryptographic randomness, the random bits for all version 4 GUIDs built in Windows are obtained via the Windows CryptGenRandom cryptographic API or the equivalent, the same source that is used for generation of cryptographic keys. This source is FIPS 140-certified in various versions of Windows…”



GUID generation in .NET

If we dig into the .NET source code for GUID generation, we actually come across two implementations - one for Windows and the other for Unix.

The Windows implementation of GUID generation, includes the following line of code.

int hr = Interop.Ole32.CoCreateGuid(out Guid g);

This shows us that .NET is using the built-in Windows API (RPC) for generating GUIDS, that we talked about just a moment ago

However, the Unix implementation of GUID generation is quite different and instead uses:

Interop.GetCryptographicallySecureRandomBytes((byte*)&g, sizeof(Guid));

We’ll recognise that this is the same code that we’ve seen elsewhere in .NET, used to generate cryptographically strong random numbers.

Comment: Is one version “better” than the other? - I’ve struggled to research an answer here.

  • I couldn’t find a link to the source code for the Windows GUID creation code, so I don’t think that we have public visibility of this, as this is part of the Microsoft internal Windows source code. But we do know that according to that MS whitepaper, this Windows-GUID-Generation code is specifically FIPS certified. We’ll return to consider FIPS in a moment.
  • We can say that the Unix implementation is as cryptographically strong as everything else in .NET [that is calling the underlying OS crypto libraries]. There will be no specific FIPS compliance, unless the OS is providing this.
  • I found it interesting that there has been a specific decision to implement two solutions rather than using the same Interop.GetCryptographicallySecureRandomBytes(...) for both Windows and Unix - I’m really curious to know why, especially as previously read in the docs that .NET does not enforce FIPs compliance.



Sorting database fields with a GUID

Before we finish with the subject of GUIDs, I wanted to share something that is a little out of place in this article, but is a technique that I’ve been using for many many years when working with databases. This is a classic one of those “something seasoned developers know about, but people new to the industry may not have come across before”.

If you didn’t already know, it’s possible to randomly sort fields in a database by SORTing with a SQL NewID():

SELECT * FROM MyTable.MyField BY NEWID()

Whilst fact-checking that advice, I came across this article on Stack Exchange : What is the best way to get a random ordering? that queries the performance impact of that operation, when working on big databases - so use this suggestion with consideration!



Is .NET compliant with the Federal Information Processing Standard (FIPS)?

The relevance of FIPs to random number generation is really pretty niche. This is a section that should probably have been cut out during editing - but I’ve erringly decided to retain it, as I felt that it may possibly help someone, to provide context to the previous section where we mentioned FIPS compliance in relation to GUID generation.

The following is a direct cut+paste quote from the official docs Microsoft : .NET Core Federal Information Processing Standard (FIPS) compliance :

The Federal Information Processing Standard (FIPS) Publication 140-2 is a U.S. government standard that defines minimum security requirements for cryptographic modules in information technology products, as defined in Section 5131 of the Information Technology Management Reform Act of 1996.

.NET Core:

  • Passes cryptographic primitives calls through to the standard modules the underlying operating system provides.
  • Does not enforce the use of FIPS Approved algorithms or key sizes in .NET Core apps. The system administrator is responsible for configuring the FIPS compliance for an operating system.

If code is written for a FIPS-compliant environment, the developer is responsible for ensuring that non-compliant FIPS algorithms aren’t used.”

This tells us that compliance functionality is an operating-system concern. This will most likely require that a specialist ensures that the OS is properly configured and that this may require that certified hardware is available on the platform.

Looking around to see how this is provided by different platforms:

  • For Windows, according to this document MS Docs : Using Windows in a FIPS 140-2 approved mode of operation , a special “FIPS Mode” needs to be configured, which in turn causes a self-test to be run before the cryptographic libraries can be used.
  • For Mac OS, according to this 2019 support article Apple FIPS Cryptographic Modules v9.0 for Intel for macOS Mojave 10.14 , there is no requirement to setup or configure any sort of “FIPS mode”.
  • For Linux, there are various distros, but arbitrarily picking Ubuntu for example; according to the 2020 article Canonical : FIPS 140-2 certification for Ubuntu 18.04 LTS , they offer different distributions for different customers.
    • Canonical mentions that “When a FIPS-validated software is modified in any way (including patching), it loses its certification and will need to be re-certified. The recertification process can easily stretch to months depending on the changes, however, Canonical offers flexibility with both FIPS certified and FIPS compliant updates available.”




image of donald knuth

A brief background and history of PRNG algorithms

Next, we’re going to head in the direction of a decidedly geeky geek-out…!

We’ll put names to the people, whose contribution to computer science, we use every day.



Knuth’s subtractive number generator

Donald Knuth invented the “subtractive number generator” PRNG.

Knuth is an eminent computer scientist, mathematician and Professor Emeritus of Stanford University in the USA.

Knuth is the author of the book series The Art of Computer Programming. These books were first published in 1968 and are commonly regarded as fundamental reference material within computer science academia.

The second volume of his book,Seminumerical Algorithms discusses an algorithm called a “subtractive number” generator method of creating pseudo-random numbers.

If you’re curious, there is a short excerpt from this book, related to random numbers, available on the publishers website.

I found an excellent detailed technical article at RosettaCode: Subtractive generator which gives a detailed explanation of the theory along with code in examples in multiple languages.

Note: A version of Knuth’s subtractive number generator is used to implement the System.Random class in both the legacy .NET Framework and .NET Core.

Suggestion: If you are further interested in stories about people, I came across this a recent 2020 interview with Knuth, now in his eighties.



Xorshift random number generators

George Marsaglia discovered the Xorshift generators.

Marsaglia was another Professor Emeritus of mathematics and computer science, this time from Washington State University. He is another specialist in the field of random numbers, including having developed tests to measure the quality of random numbers.

The name “XorShift” comes from the operator functions used - “Xor” and “Bit Shifting”.

The generator works by taking the Logical Exclusive OR (XOR) of a number combined with a bit-shifted version of itself

If you’re interested in the university whitepaper, a PDF copy can be downloaded from the Journal of Statistical Software

Note: “Xorshift” is not used in any .NET base classes, but is included in this article to help chart the evolution of these algorithms.



Xorshift+ random number generators

Makato Matsumoto and Mutsuo Saito proposed the idea of “XorShift+” : Hiroshima University : XORSHIFT-ADD (XSadd)

Matsumoto is a professor in the department of mathematics at Hiroshima University in Japan. This professor doesn’t have a Wikipedia entry, but I found the following resources about him:

Saito has little information available online (that I could find) other than that they also work in the department of mathematics at Hiroshima University.


The original Xorshift by Marsaglia, has code that is small and fast. However, a limitation of this code is that it doesn’t statistically produce a spread of numbers good enough for all uses.

“Xorshift+” builds upon the original by further applying a “non-linear transformation”.

According to this article, “Xorshift+” works by “adding consecutive outputs of an underlying Xorshift generator”.

This paper Sebastiano Vigna : Further scramblings of Marsaglia’s Xorshift generators digs deeply into the theory behind these random generators and their relative effectiveness at generating random numbers.

Finally, I found this 2018 blog by CodingHaus : Fastest C# Random Number Generator: XorShift+ which includes code examples in C#, which we may find useful if we wanted to roll our own generator instead of using System.Random.

Note: As with “Xorshift”, “Xorshift+ ” is not used in any .NET base classes.



Xoshiro / Xoroshiro random number generators

Finally, let’s introduce Sebastiano Vigna who, in collaboration David Blackman, designed a family of “linear engine” random number generators called Xoroshiro/Xoshiro

Vigna, is a professor in the department of computer science at the University of Milan, Italy.

I could find no information about Blackman, other than the passing reference at the top of the university article about the PRNG.

Despite the similar-sounding name, the “Xoroshiro/Xoshiro” algorithms approaches the problem differently from the preceding “Xorshift” family.

  • There are several flavours within this family of algorithm, distinguished by a mixture of slightly different naming:
    • The name “Xoroshiro” derives from “XOR/rotate/shift/rotate”, whilst “Xoshiro” drops a step and derives from “XOR/shift/rotate”.
    • Bit size - related to the floating-point precision of the numbers used. This applies to the strength of the random generation. Different versions are intended for use on different hardware platforms.
    • The + and * suffixes relate to the further inclusion of scrambling algorithms.
  • We can read more about the different versions at Vignas University of Milan subsite : xoshiro / xoroshiro generators and the PRNG shootout.
  • For those with strong academic ability, the PDF of Whitepaper : Scrambled Linear Pseudorandom Number Generators provides an in depth description.

    • According to their paper, Black and Vigna identify that their algorithm is an “extremely fast generator that uses a few hundred bits of memory, has provable properties and passes very strong statistical tests”.
    • The foreword of that same paper also identifies the actual improvements that were introduced over previous generators. This includes “handcrafted linear transformations” that have good statistical properties and processor performance, tests to ” uncover previously unknown biases” and “scramblers” that “reduce or delete the linear artefacts”).
  • The Original C++ code for Xoroshiro is dated 2018

    • Microsoft have credited their use of the algorithm, within code-comments found in the implementation of their C# version.

Independently, Professor Melissa O’Neill of Harvey Mudd College, offers an opinion with A Quick Look at Xoshiro256** . Consider also checking out Reddit : New line of fast PRNGs released by the author of the Google’s V8 PRNGxoshiro.di.unimi.it/ where these professors exchange thoughts.

Note: Versions of the Xoshiro number generator are used in the System.Random class in the forthcoming .NET 6.x onwards. At time of writing, these changes are still in preview. We can expect the official docs to be revised later this year when .NET 6 is released in Nov 2021.



image of man with spyglass

Exploring the .NET implementations of PRNG algorithms

Next, we’ll look at the actual C# implementations in the .NET base class library and pick out the key points of how it works.



Knuth’s subtractive number generator implementation of System.Random in .NET

This is the PRNG found in .NET Core 2.x

The entire and original source code for this class can be found here at GitHub : dotnet/runtime/src/libraries/System.Private.CoreLib/src/System/Random.cs

From the above source code, I’ve extracted out a near-identical version that we can play with in isolation. To help better understand it, you may find it really helpful to cut+paste this snippet into your preferred IDE, drop a bunch of breakpoints on it, and step through the iterations.

using System;

namespace SandBox_KnuthPRNG_ConsoleApp
{
    class Program
    {
        private static int[] _seedArray;
        private static int _inext;
        private static int _inextp;


        static void Main(string[] args)
        {
            CreateArray(1); // for the purpose of demo, we hardcode seed with "one".

            for (int i = 0; i < 100; i++)
            {
                Console.WriteLine(Sample());
            }
        }

        private static void CreateArray(int Seed)
        {
            // Initialize seed array.
            int[] seedArray = _seedArray = new int[56];

            int subtraction = (Seed == int.MinValue) ? int.MaxValue : Math.Abs(Seed);
            int mj = 161803398 - subtraction; // magic number based on Phi (golden ratio)
            seedArray[55] = mj;
            int mk = 1;

            int ii = 0;
            for (int i = 1; i < 55; i++)
            {
                // The range [1..55] is special (Knuth) and so we're wasting the 0'th position.
                if ((ii += 21) >= 55)
                {
                    ii -= 55;
                }

                seedArray[ii] = mk;
                mk = mj - mk;
                if (mk < 0)
                {
                    mk += int.MaxValue;
                }

                mj = seedArray[ii];
            }

            for (int k = 1; k < 5; k++)
            {
                for (int i = 1; i < 56; i++)
                {
                    int n = i + 30;
                    if (n >= 55)
                    {
                        n -= 55;
                    }

                    seedArray[i] -= seedArray[1 + n];
                    if (seedArray[i] < 0)
                    {
                        seedArray[i] += int.MaxValue;
                    }
                }
            }

            _inextp = 21;
        }

        private static int Sample()
        {
            int locINext = _inext;
            if (++locINext >= 56)
            {
                locINext = 1;
            }

            int locINextp = _inextp;
            if (++locINextp >= 56)
            {
                locINextp = 1;
            }

            int[] seedArray = _seedArray;
            int retVal = seedArray[locINext] - seedArray[locINextp];

            if (retVal == int.MaxValue)
            {
                retVal--;
            }
            if (retVal < 0)
            {
                retVal += int.MaxValue;
            }

            seedArray[locINext] = retVal;
            _inext = locINext;
            _inextp = locINextp;

            return retVal;
        }
    }
}

Let’s figure out the key elements within this code. The class has three internal properties:

  • _seedArray - an array of 56 elements (noting that index 0 is not used) that is used to store values generated by the CreateArray(..) method.
  • _inext and _inextp that are used by the Sample() method, combining values from the array to generate the actual “random number” returned by this class.



A code explanation: Creating the array of values

  • the variable mj is initialised by subtracting the seed value, that is provided as a method argument, from a second arbitrary number.
    • this arbitrary number happens to be the mathematic “phi” (the golden ratio) and used purely because it is considered a “Nothing up my sleeve” number, not because it has any other significance.
  • the variable mk is used to store the next generated number. This will have its value inserted into the indexed position within the array.
  • the variable ii is used as an array indexer.

There are two loops used in the process of generating the array of numbers:

  • The first loop is used to populate the array with an initial list of values. It uses a wrapping indexer that spaces itself out by 21 positions.
  • The second loop is used to further scramble the values, this time using a wrapping indexer that is spaced out by 30 positions.
  • This second loop is itself iterated upon five times, helping to create a more thorough scrambling effect.

Note: I couldn’t figure out, or research, why five was the number chosen for these additional iterations.

Within each iteration of the first loop:

  • the variable ii is incremented by 21. If it exceeds 55, then 55 is subtracted (wrapping the index back into the range of possible array values. This provides a “distribution behaviour” when it comes to inserting values into the array.
  • mk is recreated by subtracting its current value from mj
  • finally, mj is set to be the value of the number just inserted into the array

Within each iteration of the second loop:

  • the variable n is used as a second indexer, this time incrementing by 30. As before, with ii it is wrapped-back when it exceeds 55.
  • the current value of an array position, has its value modified by subtracting the value of n.

Note: I deliberated over whether I should rename the variables, in the above example, to names that were longer and more self-explanatory. I’ve decided to go with the original abbreviated names, so we could directly compare the original open-source code directly with my notes.



A code explanation: Sampling values from the array

Now that we have the array _seedArray populated with a selection of well scrambled values, we can use the method Sample() to generate the final random number.

  • This class is responsible for taking two of these numbers and subtracting one result from the other.
  • The two indexers _inext and _inextp are used to select the two values from the array.
    • _inext is left to default to 0 at the start.
    • _inextp is set to the value 21 by the previous CreateArray(..) step.
  • The two indexers are both incremented with each call to the Sample() method.
    • Both indexers are reverted to index position 1 when they exceed 55. This creates a wrapping behaviour.
  • Both indexers are always separated by a “distance” of 21 array elements.

As the indexers both increment upon each invocation, this means that they always have a changing combination of indices, as they advance around the array.

It is this final mechanism that enables the generation of a vast number of pseudo random numbers.



Xoshiro implementation of System.Random in .NET

This is the PRNG used by default in .NET 6.x

.NET 6.x actually implements two very similar versions:

Additionally, there is also a backwards compatibility implementation, which reverts back to using the Knuth algorithm. This is intended for use when we specify a seed number and presumably don’t want to break our legacy unit tests).

From the source code of the above linked xoshiro256** version, I’ve extracted out an adaptation that we can play with in isolation. As before, you may find it really helpful to cut+paste this snippet into your preferred IDE, drop a bunch of breakpoints on it, and step through the iterations.

The original code includes an initializer that calls Interop.GetRandomBytes() to generate a random seed. To simplify this demonstration, I have instead hardcoded the seed values.

using System;
using System.Numerics;

namespace SandBox_XoshiroPRNG_ConsoleApp
{
    class Program
    {
        private static ulong _s0, _s1, _s2, _s3;

        static void Main(string[] args)
        {
            GenerateRandomSeed();

            for (int i = 0; i < 100; i++)
            {
                Console.WriteLine(NextUInt64());
            }
        }


        public static void GenerateRandomSeed()
        {
            // -------------- the original code ---------------
            //ulong* ptr = stackalloc ulong[4];
            //do
            //{
            //    Interop.GetRandomBytes((byte*)ptr, 4 * sizeof(ulong));
            //    _s0 = ptr[0];
            //    _s1 = ptr[1];
            //    _s2 = ptr[2];
            //    _s3 = ptr[3];
            //}
            //while ((_s0 | _s1 | _s2 | _s3) == 0); // at least one value must be non-zero
            // --------------  

            // Note:  In this demo, instead of generating a set of true seed values, by setting up and using the OS Interop to generate random numbers using hardware,
            // we'll instead  manually creating some seed numbers for the sake of simplicity

            _s0 = ulong.MinValue;   // min
            _s1 = ulong.MaxValue;   // max is 18446744073709551615;  
            _s2 = ulong.MaxValue / 2;
            _s3 = ulong.MaxValue / 4;
        }

        private static ulong NextUInt64()
        {
            ulong s0 = _s0, s1 = _s1, s2 = _s2, s3 = _s3;

            ulong result = BitOperations.RotateLeft(s1 * 5, 7) * 9;
            ulong t = s1 << 17;

            s2 ^= s0;
            s3 ^= s1;
            s1 ^= s2;
            s0 ^= s3;

            s2 ^= t;
            s3 = BitOperations.RotateLeft(s3, 45);

            _s0 = s0;
            _s1 = s1;
            _s2 = s2;
            _s3 = s3;

            return result;
        }
    }
}

Compared to Knuth’s PRNG, there is surprisingly little code to be found in this Xoshiro algorithm.

It’s also worth observing that compared to Knuth’s algorithm which has lots of iteration (well, at least during initialization), this algorithm is largely about bit-shifting and copying, which should make it lighting fast.

The class has four state variables: _s0, _s1, _s2, _s3. These are used to maintain ulong numbers (unsigned 64-bit) as the state of the class.

There are three main operations in this code, from which the name “Xoshiro” is derived:



A code explanation: The XOR in Xoshiro

One key element here is the use of the C# bitwise Logical exclusive OR operator which appears in the code as the caret symbol ^.

Explaining what Bitwise operations are is pushing the scope of this blog, but very briefly:

  • essentially they are arithmetic operations directly supported by the processor.
  • this makes them extremely fast.
  • the bitwise XOR is used to take two “bit patterns” - if the input bits are both 0 or both 1, it returns a 0. If the input bits are different (e.g. a 0 and a 1 - or a 1 and a 0), it returns 1 as the result.

An issue with using XOR operators, is that if one of the seed values were to be zero, then the result would also always be zero:

  • The (original) code that creates seed values explicitly includes a loop and test to ensure that a zero-value is never generated.
  • The code does not let us manually specify a seed value (that is instead facilitated by the legacy support back to the original Knuth generator)



A code explanation: The Rotate in Xoshiro

BitOperations.RotateLeft is a relatively new feature, introduced in .NET Core 3.x. It can be found in the System.Numerics namespace.

Bit rotation is used to “shuffle” all the bits in one direction, with any bits that “fall off the end” added back to the other end

Note: There are two lines that I found very interesting, but I was unable to figure out or explain why those specific values for rotation and multiplication were used. I’m guessing that there will be some mathematical significance. If anyone knows the reason, drop me a line and I’ll update the article with a credit.

ulong result = BitOperations.RotateLeft(s1 * 5, 7) * 9;
s3 = BitOperations.RotateLeft(s3, 45);

Comment: there is a note on the official docs for BitOperations.RotateLeft that the API is not CLS-compliant. CLS-Compliance is related to ensuring libraries can be used with any CLR language. I’ve not explored further to understand if this could be an issue.



A code explanation: The Shift in Xoshiro

Combined with the XOR operation, the four ulong state variables are shifted back to front with each other. We can see that in the lines such as these:

s2 ^= s0; s3 ^= s1; s1 ^= s2; s0 ^= s3;



image of two women

What’s happening with Cryptographically Strong RNG in .NET 6.x?

With the upcoming release of .NET 6.x. RNGCryptoServiceProvider is earmarked for obsoletion, to be replaced with System.Security.Cryptography.RandomNumberGenerator.

RandomNumberGenerator is not new to .NET 6.x, as it’s been around in .NET Core for a while at this point.

What has changed in .NET 6.x is a move away from an implementation that uses RNGCryptoServiceProvider.

If we cast an eye over the lay of the land right now…

  • RNGCryptoServiceProvider has been marked with obsolete tags in the latest source code
  • This Github issue from 2020 discusses that .NET 6.X Apps should be using RandomNumberGenerator.Create().GetBytes(...)

Looking at the current source code we can see:



Further reading and resources used to create this blog









Archives

2021 (1)
2020 (26)
2019 (27)