Universally Unique Identifiers (UUIDs)

Module Code: ELEE1119 

Module Name: Advanced Computer Engineering

Credits: 30

Module Leader: Seb Blair BEng(H) PGCAP MIET MIHEEM FHEA

UUIDs

  • 32 base 16 characters ([0-9][A-F])

  • 128 bit numbers

center

UUID Version 1

ELEE1119 | Advanced Computer Engineering

UUID Versions

There are 8 versions as of 23 June 2022

  • UUID1 = Timebased + Unique or MAC [no repeats till 3603AD] ​

  • UUID2 = Timebased(LSB) + userid​

  • UUID3 = Namespace+MD5 hash​

  • UUID4 = PRNG [1 trillion UUIDs for a chance of 2 repeats]​

  • UUID5 = Namespace + SHA-1 hash​

  • UUID6 = Timestamp and monotonic counter.

  • UUID7 = UNIX Timestamp

  • UUID8 - User defined Data

ELEE1119 | Advanced Computer Engineering
vertcial

Who?

  • Created by Microsoft but standardised by the Internet Engineering Task Force (IETF) and the International Telecommunication Union (ITU), so that each user or thing can be uniquely identifiable.
  • ITU-T X.667 | ISO/IEC 9834-8
ELEE1119 | Advanced Computer Engineering

Where?

ELEE1119 | Advanced Computer Engineering

Combinations: UUID 4

  • grain of sand
  • UUID4 has 122 random bits,
  • Volume of sand as UUID4 =

...and the volume of Jupiter

ELEE1119 | Advanced Computer Engineering

Uniqueness: UUID4

  • In the version 4, 6 bits are fixed and the remaining 122 bits are randomly generated, for a total of possible UUIDs.

  • So if the number of generated UUIDs exceeds then there must be duplicates

  • If you assume perfect randomness you would expect to see collison after

  • After a few years you would get the first collisions.

ELEE1119 | Advanced Computer Engineering

UUID 1

  • combination of:

    • current time and date.
      • RFC 4122 60-bit count of 100 since 00:00:00:00 15 October 1582 to 01/01/1970
      • Current date time since 00:00:00:00 01 January 1970

  • 48-bit MAC address of the host machine
ELEE1119 | Advanced Computer Engineering
right:

Epochs/Time

  • 01/01/1970
    • Unix engineers set the arbitrary datetime stamp because... it was convenient...
  • 19/01/2038 03:14:07 the storage for 32-bit will become obsolete, as the value will be too large
    • Will need to migrate to 64-bit or just deal with the timestamp showing:
      • 19/01/1901 03:14:08
ELEE1119 | Advanced Computer Engineering

UUID 1: Time format

Name Bytes Hex Bits Comments
time_low 4 8 32 integer giving the low 32-bits of time
time_mid 2 4 16 integer giving the middle 16-bits of time
time_hi+version 2 4 16 16 bits of time high with bits 6-7 multiplexed with version number
clock_seq_hi_and_res clock_seq_low 2 4 16 1 to 3-bit "variant" in the most significant bits, followed by the 13 to 15-bit clock sequence
ELEE1119 | Advanced Computer Engineering

Example

  1. Current timestamp in Unix Epoch Time (ns) = .

  2. Divide the timestamp by 100 to convert it to 100-nanosecond intervals:

  1. Add the number of 100-nanosecond intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01):

at the time of making the slide

ELEE1119 | Advanced Computer Engineering

Example Continued...

Breaking down the UUID components as follows:

  1. Time Low (32 bits): The first 32 bits of the timestamp in hexadecimal:

    $ printf "0x%08X\n" 1392857302
    > 0x530550D6
    
  2. Time Mid (16 bits): The next 16 bits of the timestamp in hexadecimal:

    $ printf "0x%04X\n" 7346
    > 0x1CB2
    
  3. Time High and Version (16 bits): The next 16 bits of the timestamp () with the version (1) in hexadecimal:

    $ printf "%04X\n" <<< echo $((7459+1))
    > 0x1D24
    
ELEE1119 | Advanced Computer Engineering

Example Continued....

  1. Clock Sequence (14 bits), in truth this can be 14 random bits so:

    $ printf "%04X\n" <<< echo $((RANDOM % 16384))
    > 21FF
    
  2. Node (48 bits): A randomly generated 48-bit value or MAC address if you must:

    $ dd if=/dev/urandom bs=1 count=6 2>/dev/null | od -An -tx1 | tr -d ' \n'
    > 2876c5202c7f
    
  3. Put it all togehter:

    • timeLow - timeMid - timeHigh+version - clockSeq - Node

    • 530550D6-1CB2-1D24-21FF-2876C5202C7F

ELEE1119 | Advanced Computer Engineering

UUID 2

  • Distributed Computing Environment (DCE)

  • combination of:

    • Current time and date.

    • The local identifier replaces the lower 32 bits of the timestamp.48-bit M1392857302AC address of the host machine

      • Domain Name or Hostname

        $ id -u; id -g; whoami;
        
      • MacAddress or random generated Hex -> sh1392857302

ELEE1119 | Advanced Computer Engineering

UUID 3

  • namespace could be website, DNS information, plain text, etc

  • the namespace value is hashed using the md5hash alogrithm

  • GNU Coreutils implements this using md5sum

    $ md5sum <<< "Test"
    > 2205e48de5f93c784733ffcca841d2b5  -
    

ELEE1119 | Advanced Computer Engineering

MD5 Algorithm

vertical

ELEE1119 | Advanced Computer Engineering

MD5 Alogrithm

  • Round 1: (b AND c) OR ((NOT b) AND (d))

  • Round 2: (b AND d) OR (c AND (NOT d))

  • Round 3: b XOR c XOR d

  • Round 4: c XOR (b OR (NOT d))

ELEE1119 | Advanced Computer Engineering

UUID​4

  1. Generate 128 random bits:
$ dd if=/dev/random count=16 bs=1 2> /dev/null | xxd -ps
> 7c1e598398eb691f3f4be4123c3ce9a7

[0]0111110 00001111 00101100 11000001 11001100 01110101 10110100 100011111 00111111 01001011 11100100 00010010 00111100 00111100 11101001 10100111

  1. Take the 7th byte and perform an AND operation with 0x0F to clear out the high nibble. Then, OR it with 0x40 to set the version number to 4.

00000100 = 10110100 & 00001111​ (0x0f)

01000100 = 00000100 | 01000000 (0x40)

ELEE1119 | Advanced Computer Engineering

UUID4 Example

  1. Next, take the 9th byte (00111111) and perform an AND operation with 0x3F and then OR it with 0x80.

00111111​ = 00111111​ & 00111111​ (0x3f)

100111111 = 00111111​ | 10000000 (0x80)

  1. Convert the 128 bits to hexadecimal representation and insert the hyphens to achieve the canonical text representation.​

Before: 7C1E5983-98EB-691F-3f4B-E4123C3CE9A7

After: 7C1E5983-98EB-441F-CF4B-E4123C3CE9A7

ELEE1119 | Advanced Computer Engineering

Your Turn

10101110 00100001 10110100 11111100 01101111 01110010 10100011 00110010 10111010 10000110 11010010 00001001 11010001 11100000 10101111 01101001 ​

  1. Take the 7th byte and perform an AND operation with 0x0F to clear out the high nibble. Then, OR it with 0x40 to set the version number to 4.​

  2. Next, take the 9th byte and perform an AND operation with 0x3F and then OR it with 0x80.​

  3. Convert the 128 bits to hexadecimal representation and insert the hyphens to achieve the canonical text representation.​

Answer

Before: AE21B4FC-6F72-A332-BA86-D209D1E0AF69

After: AE21B4FC-6F72-4332-BA86-D209D1E0AF69

ELEE1119 | Advanced Computer Engineering

UUID 5

  • namespace could be a website, DNS information, plain text, etc

  • the namespace value is hashed using the sha1 alogrithm

  • GNU Coreutils implements this using sha1sum

    $ sha1sum "Test"
    > 1c68ea370b40c06fcaf7f26c8b1dba9d9caf5dea  -
    
ELEE1119 | Advanced Computer Engineering

(RFC4122 Gregorian calendar no roll overs till 3603)

minecraft usese Version 4

50 quadrillion, 190 trillion and 8 1 quadrillion 434 trillion 281 billion 810 million 739 thousand and 360

The first UUID can be any of n possibilities, the second can be any of the n except the first (n-1), the third can be any except for the first two (n-2), and so on. And the total number of ways to generate r UUIDs is n^r since each of the r UUIDs has n different possibilities. 24 quadrillion 943 trillion 440 billion

Gregorian Calendar 15/10/1582 - The Gregorian calendar, also known as the Western calendar, is the most widely used calendar in the world today. Its predecessor, the Julian calendar, was replaced because it did not correctly reflect the actual time it takes the Earth to circle once around the Sun, known as a tropical year. Unix 01/01/1970 - The year 2038 problem is related to Unix time because times after 03:14:07 UTC on 19 January 2038 will require computers to store the value as greater than 32-bits.

Years designed as two digits not four. In January, a World Bank report estimated that only 21 of 139 developing countries had taken concrete steps to address the year 2000 problem. The report went on to anticipate year 2000 impacts on power, telecommunications, energy, food distribution and medical care in developing countries hundreds of billions of dollars spent to kill off the bug worldwide, Y2K ended as mere only glitches.

16384 because 2^14

md5sum needs to be stripped of whitespace and `-`

Each 512-bit block gets broken down further into 16 sub-blocks of 32 bits each. There are four rounds of operations, with each round utilizing all the sub-blocks, the buffers, and a constant array value. This constant array can be denoted as T[1] -> T[64]. Each of the sub-blocks are denoted as M[0] -> M[15].

Command - dd == data defintion arguments - if == read from FILE instead of stdin - count == Copy N blocks - bs == read/write byes 1 at a time - 2> errors redirected to blackhole Command - xxd == hexdump - ps == plain text from beginnning of stream

sha1sum generates more characters than needed so we must trim, usually from right hand side