Hi! Now that we know what code profiling is and how it is done, we can start with static code analysis. Static code analysis is a very important task, as it allows us to understand the complexity of the code we want to accelerate. A very interesting property that can be derived from this analysis is the operational intensity. We need this value in order to predict the maximum performance attainable by our code, using the Berkeley Roofline Model. For the specific case of Field Programmable Gate Arrays, the operational intensity is defined as the ratio between the number of operations performed and the memory traffic, that is, the total amount of data transferred between the dynamic RAM, or DRAM, and the FPGA device, in both directions. It is easy to see that the operational intensity depends on the code we want to accelerate; hence, each time we make an optimization that changes the way the code is written, we need to perform this step again from the beginning.
Taking this picture as an example, we can see that an FPGA board usually includes an FPGA and some memory, called DRAM in the example. The DRAM is used to store large amounts of data that will then be fed to the FPGA for the computation. The board is then usually connected to a host computer via PCIe, a particular kind of serial interface. As you can see, there are two types of data transfers to consider: the first one happens between the host and the DRAM on the FPGA board, while the second one happens between the DRAM and the FPGA itself. This last one is the traffic we need to estimate in order to calculate the operational intensity.
In order to understand how the operations are counted, let’s start with a simple example: a piece of code that performs a vector addition. Let’s define a void function called vector_add that takes 3 parameters, a, b and out, which are pointers to integers.
The function is basically composed of a for loop that iterates N times. Let’s define N and set it to 100, just for this example. Inside the loop, we add the i-th elements of a and b, and we store the result in the out array. That's a very simple piece of code!
Now, to count the operations, we will split them into three categories: arithmetic, indexing and comparison. The sum of all these categories gives us the total number of operations performed in our function. Now that we have written the function code, let’s see how to count these three categories of operations.
Starting from the arithmetic operations, we can see that the addition is performed for each element of the two vectors a and b, hence we can start by counting 100 operations for this calculation. Is that all for arithmetic operations? The answer is no! We also have to consider the index of the for loop: it is in fact incremented 100 times! Now we have accounted for all the arithmetic operations, which are 200 in total.
The same needs to be done for comparison operations. The only comparison we have in this code is the one performed in the loop, when the index i is compared to N. Counting one comparison per loop iteration, this comparison is performed N times, hence we can add 100 operations to the calculation.
Finally, we can count indexing operations. We consider as indexing each operation that involves the use of an index. In the code, we can find indexing operations 3 times: once when storing the result in out, and twice when accessing the i-th elements of a and b. As the loop body is executed 100 times, we can write down that the code performs 300 indexing operations. If we stick to the definition of indexing operation given before, we may also consider the increment of the index in the for loop as an indexing operation, as it involves an index.
This is not a problem: it is in fact possible to count this operation as an indexing one. The important thing is not to count the same operation twice.
The final thing we need in order to obtain the operational intensity of our function is the memory traffic. Memory traffic involves only variables that are inputs or outputs of our function. In our code, we have two variables used as inputs, namely a and b, and one variable that holds the results, which we called, very imaginatively, out. The function needs to read N elements from a, read N elements from b, and store the result of the operation into out N times. In this simple example, we are therefore reading 200 integers and writing 100 integers. In total, we can quantify the memory traffic as 300 x 4 bytes, where 4 bytes is the size of an integer, that is, 1200 bytes of information.
At this point it is easy to obtain the operational intensity: we just need to divide the number of operations, 600, by the number of bytes moved, 1200. The result of this division, 0.5 operations per byte, is the operational intensity of the analyzed code.
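The whole counting exercise can be sketched in C as follows. This is only a minimal sketch: the transcript does not show the exact source of vector_add, so the const qualifiers and the loop style are assumptions, and the names op_counts, count_vector_add and operational_intensity are hypothetical helpers introduced just to reproduce the arithmetic. The element size is passed as a parameter, because the 4-byte integer used in the example is itself an assumption about the platform.

```c
#include <assert.h>

#define N 100  /* loop trip count, set to 100 as in the example */

/* Vector addition as described in the text (signature details assumed). */
void vector_add(const int *a, const int *b, int *out)
{
    for (int i = 0; i < N; i++) {   /* 1 comparison + 1 increment per iteration */
        out[i] = a[i] + b[i];       /* 1 addition + 3 indexing operations */
    }
}

/* Hypothetical container for the static operation counts. */
typedef struct {
    long arithmetic;  /* additions plus index increments */
    long comparison;  /* loop condition checks, one per iteration */
    long indexing;    /* accesses to a[i], b[i] and out[i] */
    long bytes;       /* DRAM <-> FPGA traffic in bytes */
} op_counts;

/* Hypothetical helper: applies the counting rules from the text
   to a loop of n iterations over elements of elem_size bytes. */
op_counts count_vector_add(long n, long elem_size)
{
    op_counts c;
    c.arithmetic = n + n;            /* n additions + n increments of i */
    c.comparison = n;                /* i < N, counted once per iteration */
    c.indexing   = 3 * n;            /* a[i], b[i], out[i] */
    c.bytes      = 3 * n * elem_size; /* read a, read b, write out */
    return c;
}

/* Operational intensity = total operations / bytes moved. */
double operational_intensity(op_counts c)
{
    assert(c.bytes > 0);  /* avoid dividing by zero traffic */
    long total_ops = c.arithmetic + c.comparison + c.indexing;
    return (double)total_ops / (double)c.bytes;
}
```

Calling count_vector_add(N, 4) gives 200 arithmetic, 100 comparison and 300 indexing operations over 1200 bytes, and operational_intensity then returns 0.5 operations per byte, matching the numbers worked out above. If the index increment were classified as indexing instead of arithmetic, the split would change to 100 and 400, but the total of 600 operations would stay the same.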