|| Author: Cyberdude || Back to articles ||
Author: Ferraro Giuseppe aka Cyberdude
Mailto: viromarquantbello[at]libero[dot]it
Date  : December 2006

Modulate Arithmetic + Galois Field finished space base knowledges and use  of their for AES ciphrature
------------------------------------------------------------------------------------------------------

[0x0] Base Knowledge about Modulate Arithmetic
[0x1] Euclide's Greatest common divisor algorithm and asm implementation 
[0x2] The Modulate Polinomial Arithmetic in the Galois Field form GF(p^n)
[0x3] The Rijndael Advanced Encryption Standard ciphrature's structure introduction
 [0x3x1] The subphases of a single encryption phase
 [0x3x2] The Aes key's expansion from 4 to 44 words procedure 

----------------------------------------------------[ 0x0 - Base Knowledge about Modulate Arithmetic ]
If I have an integer 'n' and another integer 'a' that we can imangine n = 3 and a = 7, if I divide a/n
I obtain another integer that I can call 'q' and another integer that I can call 'r'. This  4  integer
have the next relaction : a/n = q+r, if I want do an exemple I can say that 7/3 = 2+1 since  3 * 2 = 6
and 6+1 = 7. If I want write the operation that I did, I can write in this way 7mod3 =  1.  The  value
'mod' rapresents the value r that remain when I divide 7/3. Now I do another exemple so all people can
understend this simple concept: 

How I can obtain 10mod3?? I must multiply 3 for several time while I can't obtain the value 10  ora  a
value that is approaching to 10. So I have that 
3*1 = 3 that is < 10
3*2 = 6 that is < 10
3*3 = 9 that is < 10
3*4 = 12 that is > 10

since 3*4 is 12 and 12 is > 10 I must return to 3*3 = 9. Since 9+1 = 10 I have that 10mod3 = 1!For the
same motive I have that 

+5  mod 2   = 1
+15 mod 12  = 3
+20 mod 5   = 0
-11 mod 7   = since I have -11 I take you an exemple how to resolve it :

7 * -1 = -7  that is > -11
7 * -2 = -14 that is < -11 

since -14 is the first value < -11 I have found my mod since -14 + 3 = -11!! So -11 mod 7 = 3. In  the
same way I can say that -10 mod 5 = 0 since 5 * -2 = -10 

An interesting caracteristic of the modulate arithmetic is that if I have an operation as the next: 
[(11mod8) + (15mod8)]mod8 I can resolve it in this way: 
11 mod 8 = 3
15 mod 8 = 7 
(3+7) = 10 
since he want that the result of this sum is calcoulated in mod 8 I have that 10mod8 = 2 and  in  this
case the result of this operation is 2 but is possible that I have an operation between differents mod
for exemple the same operation [(11mod8) + (15mod8)]mod8 have a different result if I  change  only  a
single one mod, so if I want obtain the result of [(11mod7) + (15mod8)]mod8 I have that:
11 mod 7 = 4
15 mod 8 = 7 
and the sum between 4 and 7 is 11 and 11mod8 is 3 and not 2. 
After that I have explain it to you we can return at the case in witch several operations are  execute
with the same mod, so we return to consider this : [(11mod8) + (15mod8)]mod8. If I want  resolte  this
operation is not necessary that I do 11mod8 and 15mod8 but I can do (11+15)mod8 and the result is  the
same that we have seid in the first time since 26mod8 = 8*3 =24 and 24+2=26 so 26mod8 = 2. In  general
we have that: 

[(a mod n) + (b mod n)]mod n = (a+b)mod n
[(a mod n) - (b mod n)]mod n = (a-b)mod n
[(a mod n) * (b mod n)]mod n = (a*b)mod n 

When we work in modulate arithmetic we work with a limitate group of integer. To explain this  concept
I report you an exeple: if I work in mod8 all value that I can obtain are all value x where 0<=x<8, in
generale when we work in mod n we have a group of x value where 0<=x<n, but return at the  case  where
we work in mod 8 and try to calcoulate the big number mod 8 to control if we obtain results >=8 :

1  mod 8 = 8*0 + 1  and 0  + 1 = 1  so 1  mod 8 = 1
2  mod 8 = 8*0 = 0  and 0  + 2 = 2  so 2  mod 8 = 2
10 mod 8 = 8*1 = 8  and 8  + 2 = 10 so 10 mod 8 = 2
15 mod 8 = 8*1 = 8  and 8  + 7 = 15 so 15 mod 8 = 7
30 mod 8 = 8*3 = 24 and 24 + 6 = 30 so 30 mod 8 = 6
32 mod 8 = 8*4 = 32 and 32 + 0 = 32 so 32 mod 8 = 0
51 mod 8 = 8*6 = 48 and 48 + 3 = 51 so 51 mod 8 = 3

as you can say the min value that we have obtenuted is 0 and the max is 7 since we consider a modulate
arithmetic in 8. 

We can obtain several class of value for all mod n that we want. To explain this concept I consider an
exemple in witch I use a mod4 arithmetic. As we have seid in the prev strings, since we work  in  mod4
we can obtain only 4 different results and this possible results are = 0 1 2 and 3. If I want create a
class in witch are there all value that in mod4 return the result 0 I must do a  simple  operation  as
showed next : 

[0] = 0 + 04 + 04 + 04 + 04 + 04  ...
[0] = 0   04   08   12   16   20  ...

and 

[0] = ... - 04 - 04 - 04 - 04 - 04 - 0
[0] = ...  -20  -16  -12  -08  -04   0

so for the result = 0 we have this class : 
[0] = { ... -20 -16 -12 -8 -4 0 4 8 12 16 20 ... } 
And if you try 
0mod4, 4mod4, -4mod4, 8mod4, -8mod4, 12mod4, -12mod4, 16mod4, -16mod4 etc, the result is the same  and
in this case is 0. In the same way we can obtain the classo of resuts [1] [2] and [3]. 

[0] = { ... -20 -16 -12 -8 -4 0 4  8 12 16 20 ... }
[1] = { ... -19 -15 -11 -7 -3 1 5  9 13 17 21 ... }
[2] = { ... -18 -14 -10 -6 -2 2 6 10 14 18 22 ... }
[3] = { ... -17 -13 -9  -5 -1 3 7 11 15 19 23 ... }

After that we can create several tables that show the resoutls of  several operation in mod n. Imagine
that we want obtain the table of results of addition operation in mod 8, we know that in mod 8 we have
8 possible value: 0,1,2,3,4,5,6,7 so we can create our table in this way: 

     +|0|1|2|3|4|5|6|7|
     -----------------
     0| | | | | | | | |
     1| | | | | | | | |
     2| | | | | | | | |
     3| | | | | | | | |
     4| | | | | | | | |
     5| | | | | | | | |
     6| | | | | | | | |
     7| | | | | | | | |

We know that (0+0)mod8 = 0 and (0+1)mod8 = 1 and (0+2)mod8 = 2 etc so the first row  is  rapresentated
by the succession 0 1 2 3 4 5 6 7 as showed next and the other rows are very similar to it

     +|0|1|2|3|4|5|6|7|
     -----------------
     0|0|1|2|3|4|5|6|7|
     1|1|2|3|4|5|6|7|0|
     2|2|3|4|5|6|7|0|1|
     3|3|4|5|6|7|0|1|2|
     4|4|5|6|7|0|1|2|3|
     5|5|6|7|0|1|2|3|4|
     6|6|7|0|1|2|3|4|5|
     7|7|0|1|2|3|4|5|6|

In the same way we can rapresent the table of multiplication: 

     x|0|1|2|3|4|5|6|7|
     -----------------
     0|0|0|0|0|0|0|0|0|
     1|0|1|2|3|4|5|6|7|
     2|0|2|4|6|0|2|4|6|
     3|0|3|6|1|4|7|2|5|
     4|0|4|0|4|0|4|0|4|
     5|0|5|2|7|4|1|6|3|
     6|0|6|4|2|0|6|4|2|
     7|0|7|6|5|4|3|2|1|

If we read the table showed up, we can say an important particoular : 
(6x3) mod 8 = 2 since 6x3 = 18 and 18 = 8*2 = 16 + [2] = 18
but 
(6x7) mod 8 = 2 too since 6x7 = 42 and 8*5 = 40 + [2] = 42
but 3mod8 is different from 7mod8, 3mod8 is 3 and 7mod8 is 7. The "problem" stay in the number 6 since
when we do a multiplication operation with a generic number x (in our exemple x is 6) if x have one or
several factor equals with the mod number (in our case 8) the 'r' values (the  results  of module) can
obtain the same result for differents value. To understend this concept you can read the next exemple:
We consider the mod8 and the number 6; 8 and 6 have the value 2 in their factor since 8:2  is possible
and 6:2 is possible, this particoular is the cause of this :

mod 8 value       : 00 01 02 03 04 05 06 07
number 6 multiply : 00 06 12 18 24 30 36 42
results of mod    : 00 06 04 02 00 06 04 02 

As we can see in the exemple (6x2)mod8 and (6x5)mod8 have as result 6 but 2mod8 = 2 and 5mod8 = 5!! If
we consider the value 5 and not 6, we obtain 8 different results for the first 8 multiply of  5  since
the number 8 and the number 5 have all factor different; 8 is divisible for 2 for 4 and for  8,  5  is
divisible only fo 5 (the value 1 is not considered). If we try to repeat the same  operation  that  we
have execute for the value 6 we obtain this

mod 8 value       : 00 01 02 03 04 05 06 07
number 5 multiply : 00 05 10 15 20 25 30 35
results of mod    : 00 05 02 07 04 01 06 03 

As we can see everyone result of the multiplication of 5 with the first 8 value < of 8  in  mod  8  is
no like everyone of the others result. 

--------------------------[ 0x1 - Euclide's Greatest common divisor algorithm and asm implementation ]
An interesting use of the modulate arithmetic is rapresentated by the Euclide's Algorithm. Euclide use
the mod arithmetic to obtain the Greatest common divisor between two number, if you don't know what is
the greatest common divisor (that now we name as gcd) i do a little exemple. If I have two number, for
exemple 24 and 18 I can say that the gcd between 24 and 18 is 6 becouse :

24 : 2 = 12   |   18 : 2 = 9
12 : 2 = 6    |    9 : 3 = 3 
6  : 2 = 3    |    3 : 3 = 1
3  : 3 = 1    |

so I can say that:
-> 24 = 2 * 2 * 2 * 3 
-> 18 =         2 * 3 * 3

the factor that 24 and 18 have in common are only 2*3 = 6 so... I can divide 24:6 and 18:6  and  6  is
the gcd between 24 and 18. If I have the value 24/18 I know that the maximum semplification of it  is:
4/3 that is 24/6 = 4 and 18/6 = 3. 

To obtain this gcd Euclide use a modulate arithmetic, in particoular, when we  have  two  number,  for
exemple 55 and 22 the algorithm find the max between they (in this case 55) and do:

=> minValue mod (maxValue mod minValue)

and repeat this operation while the minValue is different from 0 or  while the two value are the same.
To understend this operation we  can try to solve the gcd between  55    and  22  as  showed  in  this
procedure:

phase[1] max = 55 min = 22 -> 22 mod (55mod22) -> 22mod(11)
phase[2] max = 22 min = 11 -> 11 mod (22mod11) -> 11mod(11)
phase[3] max == min -> gcd = 11

The pseudocode of the Euclide algorithm used to find the gcd using the modulate arithmetic is this: 

EUCLIDE(a,b)
1. A<-a B<-b
2. If B = 0 return A = gcd(a,b)
3. R = A mod B
4. A<-B
5. B<-R
6. goto 2

I have implementated it in assembler AT&T and I have thinked the next implementation but  everyone can
try to implmenet it as he want and using the lenguage that he preferred: 

The Code structure that I have thinked is composit of  three  principal function and everyone function
have several procedure, the most important is the main procedure that is the function runned when  the
process started. The procedure module is the procude that apply the modulate arithmetic (that  in  the
pseudocode we have named gcd between a and b) and the control procedure is used to  transform  negativ
value (for exemple -3) in positiv (in this case 3). This control procedure is correct becouse the  two
number that are gived in input by the user can be positiv or negativ but  the  algorithm  consider the
absolute value of they since gcd(60,24) == gcd(60,-24) == 12.

[ module ]           [ control ]           [ main ]
 |_[module]           |_[control]           |_[main]
 |_[execute]          |_[change]            |_[takeValues]
 |_[compare]          |_[returnValue]       |_[controlValues]
                                            |_[findMax]
                                            |_[isLessThan]
                                            |_[areEquals]
                                            |_[continue]
                                            |_[while]
                                            |_[printgcd]
                                            |_[quit] 

When the asm binary compiled code is runned the main procedure is called. 
[main]-->[takeValues] 
              |
               --> the user insert value 1 ---
               --> the user insert value 2 -- |
                                             ||
                                             \/
                                       [controlValues]-->[control]
                                                           |   |
                                                           V   V
                                        (else)<--(if value 1 or value 1 is < 0)
                                          |                   |
(the two values are positiv)<----------   |                   |
                                       |  V                   |
                                   [returnValue]<--[change]<--|
                                          |            |
                                          |             --------->(transform -3 to 3)
                                          V
    -----------------------(now we have 2 positiv values)
   |
   |                                         |--->(excange the position of value 1 and value 2)
   V                                         |
[findMax]-->(if value 1 is < value 2)-->[isLessThan]-->(now I have value 1 > value 2)
                     |                                             |
                     |               |--->(print one of their)     |
                     V               |                             |
            (else if are Equals)-->[printgcd]-->[quit]             |
                     |                                             |  -->(start the while procedure)
                     V                                             V |
                  (else)--------------------------------------->[while]-->(if is finished)
                                                                    ^        |   |
                                                                    |        |   |
                                           -------------------------         |   |
                                          |                                  |   |
                                          |        -----------------(else)<--|   |
                                          |        |                             V
                                          |        V                         [printgcd]
                                          |     [module]                         |
                                          |        |                             V
                                          |        V                           [quit]
                                          |    [compare]-----
                                          |        |         |
                                          |        V          -->(control if the values are different)
                                           -----[execute]
                                                   |
                                                    ---> (calcoulate the module)


------[ Euclide.s - compile it using gcc Euclide.s ]--------------------------------------------------
------------------------------CUT HERE----------------------------------------------------------------

caption:        .string "[ Euclide's Greatest common divisor algorithm - asm implementation ]"
value1:         .string "\t[I]nsert the first value: "
value2:         .string "\t[I]nsert the second value: "
input:          .string "%d"
gcd:            .string "\n --> GCD = %d\n"

.globl module
module:
        pushl   %ebp                    /* Here I alloc the space that I use */
        movl    %esp, %ebp              /* in the my function named module! */
        subl    $16, %esp

        movl    $0, -8(%ebp)            /* Now I store the value 0 in -8(%ebp) */
        movl    12(%ebp), %eax          /* and take the value stored in 12(%ebp) */
        movl    %eax, -4(%ebp)          /* and move it in %eax after I move this */
        jmp     compare                 /* value in -4(%ebp) and jump to compare */

execute:
        movl    -8(%ebp), %eax          /* Here I move the value stored in -8(%ebp) */
        imull   -4(%ebp), %eax          /* in %eax and multiply it with the value in */
        movl    %eax, 12(%ebp)          /* -4(%ebp), the result is stored in 12(%ebp)*/
        leal    -8(%ebp), %eax          /* After that leal the space from -8(%ebp) in %eax*/
        addl    $1, (%eax)              /* And increment the counter (stored in -8(%ebp)*/

compare:
        movl    12(%ebp), %eax          /* Here I take another time the value */
        cmpl    8(%ebp), %eax           /* stored in 12(%ebp) and compare it with */ 
        jl      execute                 /* 8(%ebp), if the value in 12(%ebp) is < */
                                        /* of the value in 8(%ebp) I jump to execute */
                                        /* procedure */

        leal    -8(%ebp), %eax          /* If it isn't < of 8(%ebp) I leal the counter */
        subl    $2, (%eax)              /* and decrement it of 2 integer */
        movl    -4(%ebp), %eax          /* After move the value stored in -4(%ebp) in %eax*/
        imull   -8(%ebp), %eax          /* And multiply it with the value stored in -8(%ebp)*/
        movl    %eax, 12(%ebp)          /* Store the result in 12(%ebp) and move it in %edx*/
        movl    12(%ebp), %edx          /* After move 8(%ebp) in %eax and do a subtraction */
        movl    8(%ebp), %eax           /* After that return the value that is the result*/
        subl    %edx, %eax              /* of subtraction operation and here end the module */
        leave                           /* method */
        ret

.globl control
control:
        pushl   %ebp                    /* Here I alloc the space that I will use in */
        movl    %esp, %ebp              /* the control method */
        subl    $4, %esp

        cmpl    $0, 8(%ebp)             /* After that I compare the value 0 with the */
        jle     change                  /* value stored in 8(%ebp) and if it is <0 I */
                                        /* jump to change procedure */

        movl    8(%ebp), %eax           /* Else I move the value stored in 8(%ebp) in %eax */
        movl    %eax, -4(%ebp)          /* And move this value in -4(%ebp) becouse this is */
        jmp     returnValue             /* the value that the returnValue procedure must return */ 

change:
        movl    8(%ebp), %eax           /* Here I move the 8(%ebp) value in %eax and */
        negl    %eax                    /* do -(value) so if the value is -10 I do -(-10)*/
        movl    %eax, -4(%ebp)          /* that is 10 and move the result in -4(%ebp) */

returnValue:
        movl    -4(%ebp), %eax          /* After that move this value in %eax and return */
        leave                           /* it to the main function */
        ret

.globl main
main:
        pushl   %ebp                    /* In the first time I alloc the space that I use */
        movl    %esp, %ebp              /* in the main procedure */
        subl    $56, %esp

        pushl   $caption                /* After that push the string caption in (%esp) */
        call    puts                    /* and print it to screen */
	
        pushl   $value1                 /* Now print the string value1 */
        call    printf

takeValues:
        leal    -16(%ebp), %eax         /* leal -16(%ebp) in %eax and push this space */
        pushl   %eax                    /* in the stack, after push the pointer to int */
        pushl   $input                  /* And receiv the first input from the user */
        call    scanf                   /* with the printf procedure */

        pushl   $value2                 /* Here I do the same that I do in the first time */
        call    printf                  /* with the value1 string, here I obtain the value */
                                        /* of the second integer */

        leal    -20(%ebp), %eax
        pushl   %eax
        pushl   $input
        call    scanf

controlValues:
        movl    -16(%ebp), %eax         /* Now I move the first value obtenuted in %eax */
        pushl   %eax                    /* and take it in input to the control procedure */
        call    control
        movl    %eax, -16(%ebp)         /* I refresh the value stored in -16(%ebp) with the */
                                        /* output of the control procedure */

        movl    -20(%ebp), %eax         /* Here I do the same with the second value */
        pushl   %eax
        call    control
        movl    %eax, -20(%ebp)

findMax:
        movl    -16(%ebp), %edx         /* Now I move the two values in %edx and %eax */
        movl    -20(%ebp), %eax         /* And if -20(%ebp) is <= of -16(%ebp) i jump */
        cmpl    %eax, %edx              /* to the isLessThan procedure */
        jle     isLessThan

        movl    -16(%ebp), %eax         /* Else I move the value stored in -16(%ebp) and */
        movl    %eax, -12(%ebp)         /* -20(%ebp) in -12(%ebp) and -8(%ebp) a copy of */
        movl    -20(%ebp), %eax         /* the two value and jump to the while procedure */
        movl    %eax, -8(%ebp)
        jmp     while

isLessThan:
        movl    -20(%ebp), %edx         /* Here I compare the two values to know if */
        movl    -16(%ebp), %eax         /* are different or equals, if they are equals */
        cmpl    %eax, %edx              /* I jump to the areEquals procedure */
        jle     areEquals

        movl    -20(%ebp), %eax         /* Else if are different I move the two values in */
        movl    %eax, -12(%ebp)         /* -12(%ebp) and -8(%ebp), two copy of this value */
        movl    -16(%ebp), %eax         /* and jump to the while procedure*/
        movl    %eax, -8(%ebp)
        jmp     while

areEquals:
        movl    -16(%ebp), %eax         /* If the two values are the same I move one of they */
        pushl   %eax                    /* in %eax so i push it in (%esp) stack and push the */
        pushl   $gcd                    /* gcd string too and call the printf procedure */
        call    printf

        movl    $0, -36(%ebp)           /* After that move che value 0 in -36(%ebp) and call */
        jmp     quit                    /* the quit procedure*/

continue:
        movl    -8(%ebp), %eax          /* Here I push the two values and jump to the module */
        pushl   %eax                    /* procedure to obtain the mod about the two values */
        movl    -12(%ebp), %eax
        pushl   %eax
        call    module

        movl    %eax, -4(%ebp)          /* I store the restults of this module procedure */
        movl    -8(%ebp), %eax          /* -4(%ebp) to refresh his value, after refresh the */
        movl    %eax, -12(%ebp)         /* value stored in -12(%ebp) and in -8(%ebp) */
        movl    -4(%ebp), %eax
        movl    %eax, -8(%ebp)

while:
        cmpl    $0, -8(%ebp)            /* In this procedure I compare the value stored in */
        jle     printgcd                /* -8(%ebp) with the value 0 and if it is <0 jump */
                                        /* to the printgcd procedure */

        movl    -8(%ebp), %eax          /* else compare the two values and jump to continue */
        cmpl    -12(%ebp), %eax
        jl      continue

printgcd:
        movl    -16(%ebp), %eax         /* Here I push the two value in the stack and call */
        movl    -12(%ebp), %edx         /* the module procedure to obtain a mod between the */
        pushl   %edx                    /* two values */
        pushl   %eax
        call    module

        pushl   %eax                    /* Push the result of this module in the stack and */
        pushl   $gcd                    /* push the gcd string too so call the printf */
        call    printf

        movl    $0, -36(%ebp)           /* After set to 0 the value storedi in -36(%ebp) */

quit:
        movl    -36(%ebp), %eax         /* I Move this value in -36(%ebp) in %eax and call */
        leave                           /* the return(0) procedure to exit */
        ret

------------------------------CUT HERE----------------------------------------------------------------

-------------------------[ 0x2 - The Modulate Polinomial Arithmetic in the Galois Field form GF(p^n) ]
A very important arithmetich used in the encryption study is the modulate polinomial arithmetic.  This
particoular arithmetic lets to obtain the  same  number  of  number  different  from 0 in  a  finished
modulate space. You can understend this concept in the second time, for the moment I continue.The poly
arithmetic consist in a polinomial rapresentation of binary values and several operation between there
We can imagine that want obtain the table of addition and moultiplication value in a modulate 8 space.
So all my values must be rapresentated mod8. A normal modulate table can be this: 

 + [0][1][2][3][4][5][6][7]
[0]
[1]
[2]
[3]
[4]
[5]
[6]
[7]

we can insert a value in this table as we have seed in the prev paragraphe, for exemple  7+7 = 14  and
we know that 14mod8 = 8*1 = 8 + [6] = 14 so 14mod8=6... if we apply this modulate 8 arithmetic  to all
values we obtain this table: 

 + [0][1][2][3][4][5][6][7]
[0][0][1][2][3][4][5][6][7]
[1][1][2][3][4][5][6][7][0]
[2][2][3][4][5][6][7][0][1]
[3][3][4][5][6][7][0][1][2]
[4][4][5][6][7][0][1][2][3]
[5][5][6][7][0][1][2][3][4]
[6][6][7][0][1][2][3][4][5]
[7][7][0][1][2][3][4][5][6]

For the moultiplication table is the same 7x7 = 49 and 49mod8 = 1, so  the  moultiplication  table  in
mod8 is showed next

 x [0][1][2][3][4][5][6][7]
[0][0][0][0][0][0][0][0][0]
[1][0][1][2][3][4][5][6][7]
[2][0][2][4][6][0][2][4][6]
[3][0][3][6][1][4][7][2][5]
[4][0][4][0][4][0][4][0][4]
[5][0][5][2][7][4][1][6][3]
[6][0][6][4][2][0][6][4][2]
[7][0][7][6][5][4][3][2][1]

At this point you can understand the concept that I have explained first, as  you  can  see  the  odds
columns contains all value of mod8, the evens repeat several times the same value, for exemple the row
1 contain this values: [0][1][2][3][4][5][6][7] and the row  2  contain this: [0][2][4][6][0][2][4][6]
We can see that in the row number 2 there aren't the value 1 3 5 and 7 but are repetuted several times
0 2 4 and 6. The same happened for all evens rows. As I have seid  in  the  first  time  the  modulate
arithmetic polinomial in GF(2^n) solve this problem. In the first time  all  value  are interpreted in
binary form, this form is used to convert the value in polynomial form. The next table show all values
that I can obtain usign 3 bit (since I have used a mod(8) in my prev exemple)

=> 000 = 0
=> 001 = 1
=> 010 = 2
=> 011 = 3
=> 100 = 4
=> 101 = 5
=> 110 = 6
=> 111 = 7

As you can see with 3 bits I obtain all values that belongs to mod(8) finished space since the  binary
arithmetic is an arithmetic that use a base = 2 and 2^3 = 8. After that we can rapresents  the  binary
values in polinomial form, this "transformation" is very simple as showed next:

=> 000 = 0(x^2)+0(x^1)+0 = 0       = 0
=> 001 = 0(x^2)+0(x^1)+1 = 1       = 1
=> 010 = 0(x^2)+1(x^1)+1 = x       = 2
=> 011 = 0(x^2)+1(x^1)+1 = x+1     = 3
=> 100 = 1(x^2)+0(x^1)+0 = x^2     = 4
=> 101 = 1(x^2)+0(x^1)+1 = x^2+1   = 5
=> 110 = 1(x^2)+1(x^1)+0 = x^2+x   = 6
=> 111 = 1(x^2)+1(x^1)+1 = x^2+x+1 = 7

After that we have obtenuted the polynomial values of a bin rapresentation of mod(8)  numbers  we  can
use they to operate addition, moultiplication and division operations that  will  be used  in  several
encryption algorithms.

The sum operation: Imagine that we want do 
operation   :     7     +   4
binary form :   (111)   + (100)
poly form   : (x^2+x+1) + (x^2)
execution   :  x^2+x+1+x^2 (the same values are deleted) : x+1 = 011 = 3 
result      :  3

The mul operation: Imagine that we want do
operation   :  7 * 4 
binary form : (111) * (100)
poly form   : (x^2+x+1) * (x^2)
execution   : we must sum the exponents : x^4 + x^3 + x^2 = 11100 = 28 
result      : 28 

The div operation: (for this exemple we use other numbers since 4/7 is very simple)
operation   :        11 /   16
binary form :    (1011) / (10000)
poly form   : (x^3+x+1) /  (x^4)
execution   : In the first time I must "transform" the x^3+x+1 equals to the max value of 16 (in  this
              case I must transform x^3+x+1 to x^4 so I can multiply all exepression x^3+x+1 for x and
              obtain x^4+x^2+x) 

              x^4 == 16
              x^4 + x^2 + x == my x^3+x+1 trasformed 
             -------------- I must remove the same values 
              //  + x^2 + x = 110 = 6

result      : 6

Don't warry if in your mathematic 11 / 16 is different from 6 becouse this is another mathematic :p

After that I have showed you as you can solve elementary operation  of  addition  moultiplication  and
division we can calcoulate out table in GF(2^n) that since first we have calcoulated in mod(8)  now we
must work on GF(2^3) finished space.

The addition table

 + [0][1][2][3][4][5][6][7]
[0]
[1]
[2]
[3]
[4]
[5]
[6]
[7]

in GF(2^3) is this :

 + [0][1][2][3][4][5][6][7]
[0][0][1][2][3][4][5][6][7]
[1][1][0][3][2][5][4][7][6]
[2][2][3][0][1][6][7][4][5]
[3][3][2][1][0][7][6][5][4]
[4][4][5][6][7][0][1][2][3]
[5][5][4][7][6][1][0][3][2]
[6][6][7][4][5][2][3][0][1]
[7][7][6][5][4][3][2][1][0]

You can calcoulate it as showed first, for exemple 6+2 = 110 + 010 = (x^2+x)+(x) = x^2 = 100 = 4

The moultiplication table is this: 

 x [0][1][2][3][4][5][6][7]
[0][0][0][0][0][0][0][0][0]
[1][0][1][2][3][4][5][6][7]
[2][0][2][4][6][3][1][7][5]
[3][0][3][6][5][7][4][1][2]
[4][0][4][3][7][6][2][5][1]
[5][0][5][1][4][2][7][3][6]
[6][0][6][7][1][5][3][2][4]
[7][0][7][5][2][1][6][4][3]

To obtain this values you must apply the moultiplication operation first and, if the resoult value  is
greater than 7 (since we are working in mod(8) the values that we can accept are only 0,1,2,3,4,5,6,7)
you must divide this value with the poly x^3+x+1 that is a irriducible polinomy of grade 3 ( since  we
use in our exemple three bits). For exemple if you have 7x7 this operation is: 

7x7 = (111)*(111) = x^2+x+1 * x^2+x+1 = x^4+x^3+x^2+x^3+x^2+x+x^2+x+1 = x^4+x^2+1 = 10101 
since x^4 > x^2 we must do x^3+x+1 / x^4+x^2+1

pass[1] (x^3 +x +1) *x = x^4+x^2+x
pass[2] (x^4+x^2+ +1)
        (x^4+x^2+x  )
        -------------
          / + / +x+1 = 011 = 3

To the end we have obtenuted this two matrix of addition and multiplication operation of all values in
a finished space GF(2^3)

                  Addition matrix                       Moultiplication matrix 
            -----------------------------            ---------------------------
           |                             |          |                           |
              + [0][1][2][3][4][5][6][7]              x [0][1][2][3][4][5][6][7]
             [0][0][1][2][3][4][5][6][7]             [0][0][0][0][0][0][0][0][0]
             [1][1][0][3][2][5][4][7][6]             [1][0][1][2][3][4][5][6][7]
             [2][2][3][0][1][6][7][4][5]             [2][0][2][4][6][3][1][7][5]
             [3][3][2][1][0][7][6][5][4]             [3][0][3][6][5][7][4][1][2]
             [4][4][5][6][7][0][1][2][3]             [4][0][4][3][7][6][2][5][1]
             [5][5][4][7][6][1][0][3][2]             [5][0][5][1][4][2][7][3][6]
             [6][6][7][4][5][2][3][0][1]             [6][0][6][7][1][5][3][2][4]
             [7][7][6][5][4][3][2][1][0]             [7][0][7][5][2][1][6][4][3]

In this two matrix we can find the additive and multiplicative inverse for everyone of values  in  the
finished space mod(8). The additive inverse value are the value that in the table have as  results  0,
the multiplicative inverse values are rapresentated from the values that have as results 1. So we know
that :

           ------- ------------- -------------
          | value | add inverse | mul inverse |
           ------- ------------- -------------
          |   0   |      0      |      /      |
          |   1   |      1      |      1      |
          |   2   |      2      |      5      |
          |   3   |      3      |      6      |
          |   4   |      4      |      7      |
          |   5   |      5      |      2      |
          |   6   |      6      |      3      |
          |   7   |      7      |      4      |
           ------- ------------- -------------

This values are used very frequently in a encryption algorithms

---------------[ 0x3 - The Rijndael Advanced Encryption Standard ciphrature's structure introduction ]
The Aes ciphrature is a particoular ciphrature that use a modulate arithmetic to  obtain   a  ciphrate
text. There are several implementation of this ciphrature algorithm, one of this is named Rijndael AES
ciphrature. This ciphrature is different from the DES  algorithm   since  AES don't  use  the  Feistel
structure in witch the block of text are divided in two part and only one of this is used to  encript;
in the Rijndael implementation, all the bits of input text are ciphrated in  parallels  way. The input
key can have three different dimension : 128, 192 or 256 bits that are equals to 4,6 or 8  word  since
8bits = 1 byte and 4 byte = 1 word. The  input text can be only of 128 bits. In this text we  consider
that the key is composit of 128bits too. When the AES algorithm take the 128 bits of key (a key  of  4
word) it  cerat 44 differents word  of 32 bits that are equals to 10  different  keys  since  we  must
repeat 10 times the same phase. The Aes algorithm can be interpretated in this way: we have a 128 bits
of input text, in the first time this text go in input with one key of 4  words,  the  ouput  of  this
procedure go in input to the first phase that are composit by four different subphases that we see  in
the second time, the output of the first phase go in input to the second phase that is the same of the
first. This operation,that are util to create confusion but not to incremet the security  level  since
don't use the key, are repetuted for 9 times, only one of the  4 subphases use the  key and it is  the
most important of  their. After that we send  the  oputut  of  they  in  input  to  another  phase the
phase  number 10  that  is the last; this  phase is composit only of 3  subphases and the last of this
three phases is the same procedure that we have executed when the procedure  is  started. Now  we  can
imagine the AES encrypt and decrypt procedures as showed next: 

   [ Encrypt procedure ]                            [ Decrypt procedure ]
 --------------------------                      ---------------------------
|                          |                    |                           |
 -------[Input text]-------                      -------[Output text]-------
             |                  -----[key]                      ^
             |                 |       |                        |   -------------
             V                 |       V                        |                | p
      [Add round key]<---------|--- w[0,3] ------------->[Add round key]         | h
             |                 V                                ^                | a
      - - - - - - - -        [key expansion]                    |                | s
   -------   |                 |                        [Inverse sub bytes]      | e
  |          V                 |                                ^                |
p |  [Substitute bytes]        |                                |                | 1
h |          |                 |                        [Inverse shift rows]     | 0
a |          V                 |                                ^   -------------
s |    [Shift rows]            |                                |                | p
e |          |                 |                       [Inverse mix columns]     | h
  |          V                 |-------                         ^                | a
1 |   [Mix  columns]           |       |                        |                | s
  |          |                 |       |         ------->[Add round key]         | e
  |          V                 |       V        |               ^                |
  |   [Add round key]<---------|--- w[4,7]------                |                | 9
  |          |                 |                        [Inverse sub bytes]      |
   -------   V                 |                                ^                |
            ...                |                                |                |
   -------   |                 |                       [Inverse  shift rows]     |
  |          V                 |                                ^   -------------
p |  [Substitute bytes]        |                                |
h |          |                 |                               ...
a |          V                 |                                ^   -------------
s |    [Shift rows]            |                                |                | p
e |          |                 |                       [Inverse mix columns]     | h
  |          V                 |-------                         ^                | a
9 |   [Mix  columns]           |       |                        |                | s
  |          |                 |       |           ----->[Add round key]         | e
  |          V                 |       V          |             ^                |
  |   [Add round key]<---------|--- w[36,39]------              |                | 1
  |          |                 |                        [Inverse sub bytes]      |
   -------   V                 |                                ^                |
p |  [Substitute bytes]        |                                |                |
h |          |                 |                        [Inverse shift rows]     |
a |          V                  -------                         ^   -------------
s |     [Shift rows]                   |                        |
e |          |                         |           ----->[Add round key]
  |          V                         V          |             ^
1 |    [Add round key]<------------ w[40,43]------              |
0 |          |                                                  |
   -------   V                                                  |
       [Ciphred text]                                     [Ciphred text]

As you can see the ciphrature procedure and the deciphrature procedure are composit of 10  phases. The
phase number 10 is composit of 3 subphases and the others phase are composit of 4 subphases  for  boot
the procedure. In the ciphrature execution the key used in the first phase is the 128  bits  that  the
user insert in input, the others keys are all keys of 4  words but are created from  the  original. In
the deciphration execution (that started from the bottom in the right section of the image) the  first
key used is the last used in the ciphrature procedure. The subphases used in the ciphrature  procedure
are : Substitute bytes, Shift rows, Mix columns and Add round  key;  in  the decript   procedure  this
subphase are named Inverse Substitute bytes, Inverse Shift rows, Inverse Mix Columns and Add round key
In the ciphrature and deciphrature procedure are exchanged the order  of  execution  between Add round
key and Mix columns/Inverse Mix columns. In this text we consider that the key insert in input by  the
user is composit by 128 bits that are 16 byte that are equals to 4 word;  as we can see in the  image,
this key in used to obtain 10 different keys so if one key is rapresentated by 4  words, the algorithm
must generate 44 different words  to obtain a different key for every one of the  10  phases. The next
table show the case in witch we want implements this algorithm using 192 or 256 bits for the key:

 ---------------------------------------------------- ------------ ------------ ------------
| Key dimension (words,byte,bits)                    | (4/16/128) | (6/24/192) | (8/32/256) |
| Input text dimension(w,b,b)                        | (4/16/128) | (4/16/128) | (4/16/128) |
| Phase's number                                     |    10      |    12      |    14      |
| Dimension of the Key for a single phase (w,b,b)    | (4/16/128) | (4/16/128) | (4/16/128) |
| Dimension of extension key (w,b)                   | (44 / 176) | (52 / 208) | (60 / 240) |
 ---------------------------------------------------- ------------ ------------ ------------

When we insert the 128 bits of original text that we want encrypt, this bits are stored in a  quadrate
matrix. It is a byte quadrate matrix that are copied  in  a  array  named  State,  during  the  phases
execution this array are changed and in the end it contain the ciphred  text. During the  deciphration
procedure is the same. So we can imagine the 128 bits stored in this way: (128 bits are equals  to  16
bytes and 16 bytes rapresents 4 words)

  (word 1)  (word 2)  (word 3)  (word 4)
      |         |         |         |
      V         V         V         V   -----
  --------- --------- --------- ---------    |
 | input00 | input04 | input08 | input12 |   |
  --------- --------- --------- ---------    | 4x4 words array : every one of the input parts is
 | input01 | input05 | input09 | input13 |   |                    composit of 1 byte (8 bits) 
  --------- --------- --------- ---------    |
 | input02 | input06 | input10 | input14 |   |
  --------- --------- --------- ---------    |
 | input03 | input07 | input11 | input15 |   |
  --------- --------- --------- ---------    |
                                        -----

During the 10 phases this matrix is changed as we can see in the next image : 

       [ input ]               [ phase 1 ]               [ phase 10 ]               [ output ]
  ---- ---- ---- ----      ---- ---- ---- ----        ---- ---- ---- ----      ---- ---- ---- ----
 |in00|in04|in08|in12|    |s0,0|s0,1|s0,2|s0,3|      |s0,0|s0,1|s0,2|s0,3|    |ou00|ou04|ou08|ou12|
 |in01|in05|in09|in13|--->|s1,0|s1,1|s1,2|s1,3|.....>|s1,0|s1,1|s1,2|s0,3|--->|ou01|ou05|ou09|ou13|
 |in02|in06|in10|in14|    |s2,0|s2,1|s2,2|s2,3|      |s2,0|s2,1|s2,2|s0,3|    |ou02|ou06|ou10|ou14|
 |in03|in07|in11|in15|    |s3,0|s3,1|s3,2|s3,3|      |s3,0|s3,1|s3,2|s0,3|    |ou03|ou07|ou11|ou15|
  ---- ---- ---- ----      ---- ---- ---- ----        ---- ---- ---- ----      ---- ---- ---- ----

The same happened for the key that is composit of 128 bits, so we can imagine that  it  is  stored  in
this way:

  ----- ----- ----- -----
 | k00 | k04 | k08 | k12 |
 | k01 | k05 | k09 | k13 |
 | k02 | k06 | k10 | k14 |
 | k03 | k07 | k11 | k15 |
  ----- ----- ----- -----

------------------------------------------------[ 0x3x1 - The subphases of a single encryption phase ]
As we have seid in  the  prev  paragraphe a  single phase of AES ciphrature algorithm is composit of 4
differents subphases. One of they use the key and increment the security level, the  others  are  used
only to create confusion. 

The first subphase that we see is named Substitute bytes. This procedure  use  a  16x16  matrix  named
s-box in witch are presents all possible permutations that we can obtain exchenging of position 8 bits
Since we have 8 bits we can obtain 256 differents values. The best way to store the value  in  a 16x16
matrix is using a value rapesentated in hexadecimal form. In the first time the s-box  is  composit in
this way:

| 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F |
| 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F |
| 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 2A | 2B | 2C | 2D | 2E | 2F |
| 30 | 31 | 32 | 03 | 34 | 35 | 36 | 37 | 38 | 39 | 3A | 3B | 3C | 3D | 3E | 3F |
| 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 4A | 4B | 4C | 4D | 4E | 4F |
 ...
| 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 | 9A | 9B | 9C | 9D | 9E | 9F |
| A0 | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | A9 | AA | AB | AC | AD | AE | AF |
| B0 | B1 | B2 | B3 | B4 | B5 | B6 | B7 | B8 | B9 | BA | BB | BC | BD | BE | BF |
| C0 | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | CA | CB | CC | CD | CE | CF |
 ...
| F0 | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | FA | FB | FC | FD | FE | FF |

After that we must substitute every one element of this s-box with its multiplicative inverse (that we
have seed in the prev paragraphe). The multiplicative inverse is calcoulated on GF(2^8) finished space
and not GF(2^3) as we did in our exemple in the first paragraphe. After that  we  have  exchanged  all
values of the matrix with their multiplicative inverse values, since  all  values  are  rappresentable
with 8 bits we must obtain a new value apply for every one value this operation: 
b(i) = b(i) (+) b(i+4)mod(8) (+) b(i+5)mod(8) (+) b(i+6)mod(8) (+) b(i+7)mod(8) (+) c(i)
where (+) = xor operation bit to bit and c = 01100011. To understend this  operation  I  take  you  an
exemple. Immagine that in one position of our matrix after the change we have obtenuted the hex values
A6 that in binary is 10100110 so we know that 
b[0] = 0
b[1] = 1
b[2] = 1
b[3] = 0
b[4] = 0
b[5] = 1
b[6] = 0
b[7] = 1
and we must apply for every one bit of A6 the operation showed first. So : 

b[4] = b[4] (xor) b[0] (xor) b[1] (xor) b[2] (xor) b[3] (xor) c[4] 
b[4] =  0   (xor)  0   (xor)  1   (xor)  1   (xor)  0   (xor)  0
b[4] =  0

If you think that for this operation is necessary a lot of time, you must think  that  this  operation
is execute by a calcoulatore and a xor operation is one of base operation that a calcoulator  execute.
The results of this operations is the next matrix:

                                             columns
         ----------------------------------------------------------------------------------
        |                                                                                  |
   ---    | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F |
  |  | 00 | 63 | 7C | 77 | 7B | F2 | 6B | 6F | C5 | 30 | 01 | 67 | 2B | FE | D7 | AB | 76 |
  |  | 01 | CA | 82 | C9 | 7D | FA | 59 | 47 | F0 | AD | D4 | A2 | AF | 9C | A4 | 72 | C0 |
r |  | 02 | B7 | FD | 93 | 26 | 36 | EF | F7 | CC | 34 | A5 | ES | F1 | 71 | D8 | 31 | 15 |
o |  | 03 | 04 | C7 | 23 | C3 | 18 | 96 | 05 | 9A | 07 | 12 | 80 | E2 | EB | 27 | B2 | 75 |
w |  | 04 | 09 | 83 | 2C | 1A | 1B | 6E | 5A | A0 | 52 | 3B | D6 | B3 | 29 | E3 | 2F | 84 |
s |  | 05 | 53 | D1 | 00 | ED | 20 | FC | B1 | 5B | 6A | CB | BE | 39 | 4A | 4C | 58 | CF |
  |  | 06 | D0 | EF | AA | FB | 43 | 4D | 33 | 85 | 45 | F9 | 02 | 7F | 50 | 3C | 9F | A8 |
  |  | 07 | 51 | A3 | 40 | 8F | 92 | 9D | 38 | FA | BC | B6 | DA | 21 | 10 | FF | F3 | D2 |
  |  | 08 | CD | 0C | 13 | EC | 5F | 97 | 44 | 17 | C4 | A7 | 7E | 3D | 64 | 5D | 19 | 73 |
  |  | 09 | 60 | 81 | 4F | DC | 22 | 2A | 90 | 88 | 46 | EE | B8 | 14 | DE | 5E | 0B | DB |
  |  | 0A | E0 | 32 | 3A | 0A | 49 | 06 | 24 | 5C | C2 | D3 | AC | 62 | 91 | 95 | E4 | 79 |
  |  | 0B | E7 | C8 | 37 | 6D | 8D | D5 | 4E | A9 | 6C | 56 | F4 | EA | 65 | 7A | AE | 08 |
  |  | 0C | BA | 78 | 25 | 2E | 1C | A6 | B4 | C6 | E8 | DD | 74 | 1F | 4B | BD | 8B | 8A |
  |  | 0D | 70 | 3E | B5 | 66 | 48 | 03 | F6 | 0E | 61 | 35 | 57 | B9 | 86 | C1 | 1D | 9E |
  |  | 0E | E1 | F8 | 98 | 11 | 69 | D9 | 8E | 94 | 9B | 1E | 87 | E9 | CE | 55 | 28 | DF |
  |  | 0F | 8C | A1 | 89 | 0D | BF | E6 | 42 | 68 | 41 | 99 | 2D | 0F | B0 | 54 | BB | 16 |
   ---


We know that the inputs of AES ciphrature algorithm are a key of 128bits and a text input that we want
entrypt of 128bits too. This 128bits of text are stored in a matrix and copied in a array named State.
During the execution of Substitute bytes procedure this array  is  changed  using  the  s-box. We  can
imagine an input matrix as showed in the next image: 

  word1  word2  word3  word4
    |      |      |      |
    V      V      V      V 
 ------ ------ ------ ------
|  EA  |  04  |  65  |  85  |
|  83  |  45  |  5D  |  96  |
|  5C  |  33  |  98  |  B0  |
|  F0  |  2D  |  AD  |  C5  |
 ------ ------ ------ ------

The value stored in it are rapresentated in hex form since we must interact with the s-box matrix that
is a 16x16 matrix. The hexadecimal value EA is 234 in decimal  rapresentation  and 11101010  in binary
The Substitute bytes change the value EA in this way: the first 4 bits 1110 that in decimal is 14  and
in hexadecimal is E rapresents the row of s-box and the second 4bits 1010 that is 10 in decimal  and A
in  hexadecimal  rapresents  the  column  of  s-box. The  value  stored  in  the s-box  at  the  index
{row = E, column = A} is 87 in hexadecimal value that in decimal is 135 and in binary  is 10000111. So
the value EA is sobstituted by the value 87. The same operation is repetuted for all hex value  stored
in the 4x4 word  matrix (that is the array State, that is the input text of 128bits). When all  values
are exchanged the final matrix is this:

 ------ ------ ------ ------
|  87  |  F2  |  4D  |  97  |
|  EC  |  6E  |  4C  |  90  |
|  4A  |  C3  |  46  |  E7  |
|  8C  |  D8  |  95  |  A6  |
 ------ ------ ------ ------

The next image can help you to understend this very simple concept

                          -------column 5---------------
                         |                              |
                         |                              V
  ---- ---- ---- ----    |         [00][01][02][03][04][05][06][07][08][09][0A][0B][0C][0D][0E][0F]
 |    |    |    |    |   |         [01][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
 |    |  -----  |    |   |         [02][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
  ---- /|     |- ----    |         [03][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
 |    | | 45  -------------row 4-->[04][  ][  ][  ][  ][6E][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
 |    |/ -----  |    |             [05][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
  ---- ---- ---- ----              [06][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
 |    |    |    |    |             [07][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
 |    |    |    |    |             [08][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
  ---- ---- ---- ----              [09][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
 |    |    |    |    |             [0A][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
 |    |    |    |    |             [0B][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
  ---- ---- ---- ----              [0C][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
                                   [0D][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
                                   [0E][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]
                                   [0F][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ][  ]

As you can see in the image that rapresent the full aes execution, during the decryption execution  it
is a subphase named inverse sub bytes that take in input the encrypted matrix and return in output the
original value of the matrix in the time first that  it  was  encrypted  by  the substitute bytes. For
exemple we know that the aes algorithm is composit by 10 phase and every  one  phase  have a  subphase
named substitute bytes. But the substitute bytes procedure of phase 5 take in input a matrix different
from the matrix that take in input the phase number 7 and return an output different, so the output of
substitute bytes of the phase 9 in encryption process is the same matrix that we give in input to  the
second phase of decription procedure; the output of this decryptation inverse sub bytes  is  the  same
matrix that we give in input to the phase number 9 to the substitute bytes during the encryption phase
The procedure used to exchange the value from input to output matrix is the same  used  in  substitute
bytes,but the invese procedure use a 16x16 matrix in witch the elements are the invese of the elements
that we have seed in a 16x16 s-box matrix. The inverse s-box is showed next:

                                             columns
         ----------------------------------------------------------------------------------
        |                                                                                  |
   ---    | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F |
  |  | 00 | 52 | 09 | 6A | D5 | 30 | 36 | A5 | 38 | BF | 40 | A3 | 9E | 81 | F3 | D7 | FB |
  |  | 01 | 7C | E3 | 39 | 82 | 9B | 2F | FF | 87 | 34 | 8E | 43 | 44 | C4 | DE | E9 | CB |
r |  | 02 | 54 | 7B | 98 | 32 | A6 | C2 | 23 | 3D | EE | 4C | 95 | 0B | 42 | FA | C3 | 4E |
o |  | 03 | 08 | 2E | A1 | 66 | 28 | D9 | 24 | B2 | 76 | 5B | A2 | 49 | 6D | 8B | D1 | 25 |
w |  | 04 | 72 | F8 | F6 | 64 | 86 | 68 | 98 | 16 | D4 | A4 | 5C | CC | 5D | 65 | B6 | 92 |
s |  | 05 | 6C | 70 | 48 | 50 | FD | ED | B9 | DA | 5E | 15 | 46 | 57 | A7 | 8D | 9D | 84 |
  |  | 06 | 90 | D8 | AB | 00 | 8C | BC | D3 | 0A | F7 | E4 | 58 | 05 | B8 | B3 | 45 | 06 |
  |  | 07 | D0 | 2C | 1E | 8F | CA | 3F | 0F | 02 | C1 | AF | BD | 03 | 01 | 13 | 8A | 6B |
  |  | 08 | 3A | 91 | 11 | 41 | 4F | 67 | DC | EA | 97 | F2 | CF | CE | F0 | B4 | E6 | 73 |
  |  | 09 | 96 | AC | 74 | 22 | E7 | AD | 35 | 85 | E2 | F9 | 37 | E8 | 1C | 75 | DF | 6E |
  |  | 0A | 47 | F1 | 1A | 71 | 1D | 29 | C5 | 89 | 6F | B7 | 62 | 0E | AA | 18 | BE | 18 |
  |  | 0B | FC | 56 | 3E | 4B | C6 | D2 | 79 | 20 | 9A | DB | C0 | FE | 78 | CD | 5A | F4 |
  |  | 0C | 1F | DD | A8 | 33 | 88 | 07 | C7 | 31 | B1 | 12 | 10 | 59 | 27 | 80 | EC | 5F |
  |  | 0D | 60 | 51 | 7F | A9 | 19 | B5 | 4A | 0D | 2D | E5 | 7A | 9F | 93 | C9 | 9C | EF |
  |  | 0E | A0 | E0 | 3B | 4D | AE | 2A | F5 | B0 | C8 | EB | BB | 3C | 83 | 53 | 99 | 61 |
  |  | 0F | 17 | 2B | 04 | 7E | BA | 77 | D6 | 26 | E1 | 69 | 14 | 63 | 55 | 21 | 0C | 7D |
   ---

If you don't know how to convert binary to hexadecimal and vice versa you can  use  the  table  showed
next: 

 -------------
| bin  |  hex |
|------|------|
| 0000 |   0  |
| 0001 |   1  |
| 0010 |   2  |
| 0011 |   3  |
| 0100 |   4  |
| 0101 |   5  |
| 0110 |   6  |
| 0111 |   7  |
| 1000 |   8  |
| 1001 |   9  |
| 1010 |   A  |
| 1011 |   B  |
| 1100 |   C  |
| 1101 |   D  |
| 1110 |   E  |
| 1111 |   F  |
 -------------

so if I have a binary value 10101101 and I want convert it to hex value, I must divide it in group  of
4 bits for each; in this case I have two group of 4 bits : 1010 and 1101. After that I find it in  the
table, 1010 = A and 1101 = D so I know that 10101101 = AD
To obtain the binary value from an hex value I must execute the inverse operation so in the first time
show in the table that A = 1010 and D = 1101 and I know that AD = 10101101

The second subphase that we see in this text is named Shift Row, it is very simple and very useful.The
execution of this procedure consist in a succession of shift operation. At this point of text we  know
that the 128 bits of text are stored in a 4x4 word  matrix where the first  column is the  first  word
of input, the second column is the second  word etc... So when we insert in input a text, this text is
interpretated by the calculator as a succession of 128 bits and this bits  are  interpretated  in  hex
rapresentation by the algorithm. Now we can consider the  same  input  obtenuted  in  sostitute  bytes
procedure: 87 EC 4A 8C F2 6E C3 D8 4D 4C 46 95 97 90 E7 A6; we know  that  this  input of 128 bits  is
stored in a 4x4 matrix in this way:

 ------ ------ ------ ------
|  87  |  F2  |  4D  |  97  |
|  EC  |  6E  |  4C  |  90  |
|  4A  |  C3  |  46  |  E7  |
|  8C  |  D8  |  95  |  A6  |
 ------ ------ ------ ------

where the first column is the first word  since 87 is a sigle byte that is equals to 8 bits.  We  know
that every word  is composit by 4 byte that are 32 bits. Now If we work on the rows  and  not  on  the
columns we change all senses of the clear text. You must remeber that this byte array is the output of
the first subphase (the substitute byte). The Shift Row procedure don't change the first row but apply
a left shift at the rows number 2,3 and 4. For the second row we apply a left shift of a single  byte,
for the row number three we apply a shift of two bytes and for the last row we apply a left  shift  of
three bytes. When this procedure is finished the output of it is a new 4x4 matrix where  the  4  words
are shuffled as showed next :

 ------ ------ ------ ------
|  87  |  F2  |  4D  |  97  |
|  6E  |  4C  |  90  |  EC  |
|  46  |  E7  |  4A  |  C3  |
|  A6  |  8C  |  D8  |  95  |
 ------ ------ ------ ------

The third subphases as you can see in the image rapresentated in the cap 0x3 is named Mix Columns,this
procedure is used to adding other confusion exchanging the value stored in a 4x4 matrix working on the
columns of it. To execute the exchange, the mic columns procedure use another 4x4 matrix : 

 ------ ------ ------ ------
|  02  |  03  |  01  |  01  |
|  01  |  02  |  03  |  01  |
|  01  |  01  |  02  |  03  |
|  03  |  01  |  01  |  02  |
 ------ ------ ------ ------

This procedure create confusion apply a moultiplication operation in the finished space GF(2^8)  using
the polinomy x^8+x^4+x^3+x+1 as irriducible value. The mix column operation can  be  rapresentated  as
showed in the next image :

 ------ ------ ------ ------   ------- ------- ------- -------
|  02  |  03  |  01  |  01  | |  S00  |  S01  |  S02  |  S03  |
|  01  |  02  |  03  |  01  | |  S10  |  S11  |  S12  |  S13  |
|  01  |  01  |  02  |  03  | |  S20  |  S21  |  S22  |  S23  |
|  03  |  01  |  01  |  02  | |  S30  |  S31  |  S32  |  S33  |
 ------ ------ ------ ------   ------- ------- ------- -------

Where the matrix with the elements s00 s01 s02 etc is our input state matrix that is the output of the
two subphases executed in the prev times. So After the substitute bytes and  shift  row  procedure  we
have the next matrix multiplication for the execution of the third procedure of a single phase: 

 ------ ------ ------ ------   ------ ------ ------ ------
|  02  |  03  |  01  |  01  | |  87  |  F2  |  4D  |  97  |
|  01  |  02  |  03  |  01  | |  6E  |  4C  |  90  |  EC  |
|  01  |  01  |  02  |  03  | |  46  |  E7  |  4A  |  C3  |
|  03  |  01  |  01  |  02  | |  A6  |  8C  |  D8  |  95  |
 ------ ------ ------ ------   ------ ------ ------ ------

The multiplication between this two matrix is execute following this rules:

-> S0J = (2*S0J) (+) (3*S1J) (+) S2J (+) S3J
-> S1J = S0J (+) (2*S1J) (+) (3*S2J) (+) S3J
-> S2J = S0J (+) S1J (+) (2*S2J) (+) (3*S3J)
-> S3J = (3*S0J) (+) S1J (+) (2*S3J)

for exemple the element S00 =

-->         (2*S00)         (+)               (3*S10)           (+)    S20       (+)   S30
-->         (2*87)          (+)               (3*6E)            (+)     46       (+)    A6
--> (00000010 * 10000111)   xor       (00000011 * 01101110)     xor   01000110    xor 10100110
-->  [x * (x^7+x^2+x+1)]    xor  [(x+1) * (x^6+x^5+x^3+x^2+x)]  xor   x^6+x^2+x   xor x^7+x^5+x^2+x
-->    x^8+x^3+x^2+x        xor            x^7+x^4+x^5+x        xor   x^6+x^2+x   xor x^7+x^5+x^2+x 
-->   x^8+x^4+x^3+   +x+1 
      x^8+   +x^3+x^2+x     xor            x^7+x^4+x^5+x        xor   x^6+x^2+x   xor x^7+x^5+x^2+x
-->       x^4+   +x^2+ +1   xor            x^7+x^5+x^4+x        xor   x^6+x^2+x   xor x^7+x^5+x^2+x
-->          10101          xor              10110010           xor   1000110     xor 10100110 
--> 1000111 == 47

If you execute this operations on all matrix we obtain this:

 ------ ------ ------ ------
|  47  |  40  |  A3  |  4C  |
|  37  |  D4  |  70  |  9F  |
|  94  |  E4  |  3A  |  42  |
|  ED  |  A5  |  A6  |  BC  |
 ------ ------ ------ ------

The inverse procedure of it named Inverse Mix Columns used in the  decryption  phase  apply  the  same
sistem and the same operations but using the matrix showed next to reobtain the initial matrix value:

 ------ ------ ------ ------
|  0E  |  0B  |  0D  |  09  |
|  09  |  0E  |  0B  |  0D  |
|  0D  |  09  |  0E  |  0B  |
|  0B  |  0D  |  09  |  0E  |
 ------ ------ ------ ------

The last phase that we see is the most important since use the key of phase to add a security level to
a state matrix. The operation is very simple, the array state is an array of 128bits and  the  key  is
composit of 128 bits too so we can create another matrix that is a matrix that rapresents  a  courrent
key generated from the original key (you must remember that we use a different key for  every  one  of
the 10 phases). This procedure, named Add Round key is only a xor bit to bit between the  two  matrix,
the matrix of state that contain the courrent state of the text, and the matrix of  the  courrent  key
that contain one of all subkeys generated during the several phases. The  inverse  procedure  used  in
decryption phase don't change since the xor operation is invertible in fact: 10 xor 2 = 8 and 8 xor  2
= 10. Using the last array state obtenuted by the mix Columns procedure I take you an exemple of a Add
Round key execution using a casual key of 128 bits and rapresentated in a 4x4 byte array:

 ------ ------ ------ ------       ------ ------ ------ ------     ------ ------ ------ ------
|  47  |  40  |  A3  |  4C  |     |  AC  |  19  |  28  |  57  |   |  EB  |  59  |  8B  |  1B  |
|  37  |  D4  |  70  |  9F  | xor |  77  |  FA  |  D1  |  5C  | = |  40  |  2E  |  A1  |  C3  |
|  94  |  E4  |  3A  |  42  |     |  66  |  DC  |  29  |  00  |   |  F2  |  38  |  13  |  42  |
|  ED  |  A5  |  A6  |  BC  |     |  F3  |  21  |  41  |  6A  |   |  1E  |  84  |  E7  |  D2  |
 ------ ------ ------ ------       ------ ------ ------ ------     ------ ------ ------ ------

---------------------------------------[0x3x2 - The Aes key's expansion from 4 to 44  word procedure ]
In the prevs paragraphes we said that only one of four subphases use a key to increment  the  security
level of ciphrature and that the key used during the execution of the 10 phases is not the same but it
change for every one different phase. There is an algorithm that use the key of 128 bits that the user
insert in input to generate all 10 subkey of 128bits. We know that 128 bits are 4  word, the algorithm
is used to generate 44  words. Imagine that the input key of 128bits rapresentated in hex form is this

      original key of 128 bits / 16 bytes / 4 words
   ----------------------------------------------------
  |     word 1      word 2       word 4      word 4    |
  | -----------  -----------  -----------  ----------- |
   |           ||           ||           ||           |
--> EA D2 73 21  B5 8D BA D2  31 2B F5 60  7F BD 29 2F

This key is composit of 4 words, every word is composit of 4 byte end every byte is composit of 8 bits
so every word is composit of 32 bits. To generate other 40 words to obtain 44 words, one for every one
phase of aes encryption, the algorithm works in this way:

pass[1] -> expand the arrey that contain the 4 words of key to an array that have the space to preserv
           44 words. The new array is composit by 1408 bits that are 176 byte that are  equals  to  44
           words.
pass[2] -> create a temp-word that are equals to the word prev of it so temp = 7F BD 29 2F
pass[3] -> execution of RotWord procedure that apply a left shift of one byte so temp = BD 29 2F 7F
pass[4] -> execution of SubWord that exchange the value of a single byte using the s-box showed in the
           prev chapter so temp = 5D A5 15 D2
pass[5] -> execution of Rcon(1) that is a xor operation between the courrent value of temp and a value
           of RC[j] where j is the number of subkey that we must create (in this case we are  creating
           the subkey for the first phase so we must apply the xor between the courrent value of  temp
           and the value of RC[1]. So temp = 46 A5 15 D2
           The next table show the value of RC[j]:
           RC[01] = 01 00 00 00
           RC[02] = 02 00 00 00
           RC[03] = 04 00 00 00
           RC[04] = 08 00 00 00
           RC[05] = 10 00 00 00
           RC[06] = 20 00 00 00
           RC[07] = 40 00 00 00
           RC[08] = 80 00 00 00
           RC[09] = 1B 00 00 00
           RC[10] = 36 00 00 00
pass[6] -> execute a xor operation between the courrent value of temp and the 4 word  prev  of  it. In
           this case EA D2 73 21 xor 46 A5 15 D2 so temp = AC 77 66 F3 (this is the first word of  the
           new key.

Now we can create the complete key from the original:
   word 1        word 2        word 3        word 4
 ------------  ------------  ------------  ------------
  EA D2 73 21   B5 8D BA D2   31 2B F5 60  7F BD 29 2F
new key :
   word 5        word 6        word 7        word 8
 ------------  ------------  ------------  ------------
 [ ][ ][ ][ ]  [ ][ ][ ][ ]  [ ][ ][ ][ ]  [ ][ ][ ][ ]
 ------------  ------------  ------------  ------------

           Temp       RootWord      SubWord        RC(1)        xor       word(i)-4    finalword
word5 = 7F BD 29 2F  8D 29 2F 7F  5D A5 15 D2  1B 00 00 00  46 A5 15 D2  EA D2 73 21  AC 77 66 F3
word6 = AC 77 66 F3  77 66 F3 AC  FA 33 0D 91  1B 00 00 00  E1 33 0D 91  B5 8D BA D2  54 BE B7 43
word7 = 54 BE B7 43  BE B7 43 54  AE A9 1A 20  1B 00 00 00  B5 A9 1A 20  31 2B F5 60  84 82 EF 40
word8 = 84 82 EF 40  82 EF 40 84  13 DF 09 5F  1B 00 00 00  08 DF 09 5F  7F BD 29 2F  77 62 20 70

the new key that the AES use in phase 1 is this: AC 77 66 F3 53 BE B7 43 84 82 EF 40 77 62 20 70