This is the final video of the lesson,
and we'll learn three more practical things about floating-point numbers:
how to reorder computations to minimize errors,
what alternatives the double type has,
and what the special values of a double are.
Usually, there are several ways to compute something.
Mathematically, they could be equivalent.
But they can behave differently
when we use floating-point numbers,
so it can make sense to reorder computations so as to minimize the errors.
Let's look at an example.
If you try to compute 10 to the 18th plus one,
minus 10 to the 18th,
you won't get a one but a zero.
Earlier we've seen that when adding floating-point numbers,
part of the precision of the smaller one can be lost.
And here, 10 to the 18th is so big relative to one
that the one is lost completely.
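If you want to check this yourself, here's a minimal C++ sketch (any language with standard 64-bit doubles behaves the same way):

```cpp
#include <cstdio>

int main() {
    // 1e18 needs 60 binary digits, but a double keeps only 53,
    // so adding 1 to it changes nothing.
    double big = 1e18;
    printf("%.1f\n", (big + 1) - big);  // prints 0.0, not 1.0
    return 0;
}
```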
So, the conclusion is: try not to add values of very different magnitudes.
Think of the following equality,
x squared minus y squared is equal to x minus y times x plus y.
Mathematically, the equality always holds.
But for doubles, there could be a difference.
If y is equal to a billion and x is equal to y plus one,
then the left part gives us two billion,
and the right part gives us two billion plus one.
The latter is correct. But why is the one lost in the former?
When we compute x squared,
it is in fact 10 to the 18th plus two billion plus one.
And we've already seen that in such additions, the one will be lost.
So, that's an example of how we can reorder computations so as to minimize errors.
Because when we compute x minus y times x plus y,
much smaller values get added and subtracted, and so the errors are smaller.
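Here is the same comparison as a small C++ sketch:

```cpp
#include <cstdio>

int main() {
    double y = 1e9;    // a billion: fits into a double exactly
    double x = y + 1;  // so does a billion plus one

    // x*x is 1e18 + 2e9 + 1, and the trailing one doesn't fit
    // into the 53 leading binary digits of a double.
    printf("%.1f\n", x * x - y * y);      // 2000000000.0
    printf("%.1f\n", (x - y) * (x + y));  // 2000000001.0
    return 0;
}
```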
Although double is the most standard floating-point type,
there are other types in different languages.
In fact, the word double didn't come out of nowhere.
It means double precision,
as opposed to single precision.
For example, in Java the single-precision floating-point type is called float.
It has 32 bits,
half as many as the double,
and it stores only 24 leading binary digits and eight bits of exponent.
24 binary digits is nearly equal to seven decimal digits.
So, it's rather small.
And in fact, you should never use this type for computations,
unless you are ready to sacrifice precision for very high performance.
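A quick C++ illustration of how little a float can hold:

```cpp
#include <cstdio>

int main() {
    // About seven decimal digits fit into a float,
    // so an eight-digit value already gets rounded.
    float f = 123456789.0f;
    double d = 123456789.0;
    printf("%.1f\n", f);  // 123456792.0 -- off by 3
    printf("%.1f\n", d);  // 123456789.0 -- still exact
    return 0;
}
```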
In fact, there is also a chance to get a larger version of double.
In C++, there is a long double,
which has the size of either 80 bits, or 64 bits like the usual double.
It depends on the compiler.
In GNU C++ it is usually 80 bits,
and in Microsoft Visual C++ it is usually 64 bits.
You can tell what you actually have on your compiler by looking at the LDBL_MANT_DIG constant.
It tells you how many leading binary digits are stored in this type.
So, it will be 64 for the 80-bit long double,
and it will be 53, the same as in the usual double, for the 64-bit one.
So, the long double stores 11 extra digits if it has 80 bits.
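For instance, this small C++ program prints the precision of each floating-point type on your compiler (the constants come from the standard header cfloat):

```cpp
#include <cfloat>
#include <cstdio>

int main() {
    // Leading binary digits kept by each floating-point type.
    printf("float:       %d\n", FLT_MANT_DIG);   // 24
    printf("double:      %d\n", DBL_MANT_DIG);   // 53
    printf("long double: %d\n", LDBL_MANT_DIG);  // 64 or 53, compiler-dependent
    return 0;
}
```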
These 11 extra digits can be useful:
you can be slightly less careful with choosing epsilon or with other computations.
But of course, in general,
the issue of errors still remains.
Also, in Java and Python,
there is an arbitrary-precision floating-point type.
It's called BigDecimal in Java and Decimal in Python.
It can store as many leading digits as you set.
But of course, the more digits you store,
the more space and time it takes.
So you can't just ask for an enormous number of digits,
because the operations will take a long time to complete.
In fact, in computations the usual double type nearly always
suffices if you keep in mind how errors behave.
Due to the structure of the double type,
it can hold some values which you won't find among integers.
For example, if you divide integer one by integer zero, your program will crash.
But for doubles, the division will go through fine,
and you get a special value, positive infinity.
Similarly, dividing minus one by zero gives negative infinity.
Both these values satisfy some natural properties,
like the positive infinity is greater than any number,
the infinity plus one is infinity and so on.
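Here is a small C++ sketch (the division goes through a variable so the compiler doesn't complain about a literal division by zero):

```cpp
#include <cstdio>

int main() {
    double zero = 0.0;
    double inf = 1.0 / zero;         // no crash: positive infinity

    printf("%f\n", inf);             // inf
    printf("%d\n", inf > 1e308);     // 1: greater than any number
    printf("%d\n", inf + 1 == inf);  // 1: infinity plus one is infinity
    return 0;
}
```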
But what if you subtract infinity from infinity?
What will you get? It's hard to imagine some reasonable result.
But in fact, there is another special value for that,
a NaN, which stands for Not a Number.
There are other ways to obtain this value,
but in all cases,
it means that the result is somehow undefined.
It also has a kind of natural behavior.
For example, every operation with a NaN will return
a NaN, and any comparison with a NaN, except checking non-equality, will return false.
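Continuing the sketch above:

```cpp
#include <cmath>
#include <cstdio>

int main() {
    double zero = 0.0;
    double inf = 1.0 / zero;
    double nan_value = inf - inf;            // infinity minus infinity is NaN

    printf("%f\n", nan_value);               // nan (possibly printed as -nan)
    printf("%d\n", nan_value == nan_value);  // 0: even NaN == NaN is false
    printf("%d\n", nan_value != nan_value);  // 1: non-equality is the exception
    printf("%d\n", std::isnan(nan_value));   // 1: the reliable way to test for NaN
    return 0;
}
```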
Although those values can be useful,
you probably don't want Inf or NaN appearing in your output,
as the answer is usually a regular number.
So, if you see Inf or NaN,
it usually means that you have a bug somewhere.
And the most dangerous are bugs
where a regular value turns into Inf or NaN because of errors.
For example, square root of a negative value is always a NaN,
even if that value is close to zero.
So if your actual value is zero,
and because of errors you store minus one billionth instead,
the square root of minus one billionth is a NaN.
So in fact, to avoid this,
instead of square root of x,
you can just write square root of the maximum of x and zero.
And as you probably weren't going to
take the square root of a truly negative number anyway,
nothing changes, except in that very case
when, because of errors, taking the square root of zero would become a NaN.
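As a C++ sketch:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

int main() {
    // The true value is zero, but rounding errors left a tiny negative number.
    double x = -1e-9;

    printf("%f\n", std::sqrt(x));                 // nan
    printf("%f\n", std::sqrt(std::max(x, 0.0)));  // 0.000000
    return 0;
}
```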
So, let's summarize what we learned about floating-point numbers.
First of all, with doubles,
there are always errors.
So, you can only expect that your stored value is not far from the actual,
but never that they are exactly equal.
Because of that, you should always use integers if possible.
But you need to watch out for overflow.
If you decide to use doubles though,
any comparison you make should admit an error of epsilon.
And in particular, you should never compare doubles with the equality operator.
Try to minimize errors,
for example by reordering computations and by avoiding adding values of very different magnitudes.
Also, avoid Inf and NaN in the output.
If they appear, you most likely have a bug.
And that will be all about floating-point numbers.
So, see you in the next lesson.