A site for solving at least some of your technical problems...
Ada variables are complex objects. When developing a compiler, you must definitely take that into account. Variables have several sides. For one, the compiler needs to be able to handle them dynamically: it has to be capable of performing all the operations on all the constant variables just as if the program were executing, and it has to handle all the tests necessary to ensure integrity.
So... we need a library that can handle integers, a library to handle floating points, a library to handle arrays, etc.
Here I present my current idea of an integer record. Note that fixed points can generally be handled by an integer library, although a lot of shifting is necessary (even if only virtual, i.e. one byte or word being ignored.)
The following are the fields that define an integer. The record represents the integer itself and includes many flags defining the status of the integer.
The value may not fit in a basic type in which case the data will be pointed to. The record includes a flag used to define whether the value is inlined or allocated separately:
With such a definition, the access Integer could be recursive. We want to make sure that access Integer points to a native type (i.e. with inline_value = True.)
Note that these fields have no defaults, except the value, which is set to the user defined default when there is one. Otherwise the value field is not set, and it raises an exception when read before being set. (see Status - Initialized)
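To make the record more concrete, here is a minimal sketch in Python (used here only because it runs as is). Apart from inline_value, every name in it (IntegerRecord, UninitializedRead, size, get, set) is an assumption of mine for illustration, not the final layout. It only shows the "read before set raises" rule and the inline flag.

```python
from dataclasses import dataclass
from typing import Optional


class UninitializedRead(Exception):
    """Raised when the value is read before it was ever set."""


@dataclass
class IntegerRecord:
    # Hypothetical field names; the text only names inline_value.
    size: int                      # size of the value in bytes
    inline_value: bool = True      # True: the value lives in the record itself
    _value: Optional[int] = None   # None stands for "not set yet"

    def get(self) -> int:
        # Reading an unset value is an error, as described above.
        if self._value is None:
            raise UninitializedRead("value read before being set")
        return self._value

    def set(self, v: int) -> None:
        self._value = v


r = IntegerRecord(size=4)
try:
    r.get()
    read_failed = False
except UninitializedRead:
    read_failed = True   # expected: nothing was stored yet

r.set(42)
```

A user defined default would simply be passed as the initial _value, in which case the first read succeeds.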
The size parameter must be an inline integer. Also, once optimized it becomes optional for integers that use a fixed number of bytes for their value since those do not need to use a dynamic size.
Values are used for several entries:
For one type, all of its values are defined in the same way. However, a dynamic integer may have dynamic entries that do not match one to one in their binary form.
Computations on these objects may generate errors. Those are reported directly in the object and a function can be called to deal with the status accordingly. For instance, one may want to ignore errors that wrap around values.
The overflow flag is set whenever the result of a computation is too large to fit in the value. Too large means larger than the maximum value of the data type.
By default, generate an exception on an overflow. Set overflow_mask to False to avoid the exceptions.
The overflowed flag is set to True whenever a computation results in an error. The result of the computation is still saved in the value field. The last valid value is kept as such (i.e. a valid value for the variable.)
Underflow is the same as Overflow, except that it applies when the result of a computation is lower than the minimum.
Note that in some circumstances we cannot easily determine whether the result is too small or too large. In that case, both the Overflow and Underflow flags are set.
The mask works the same way as it does for Overflow.
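The interaction between the flags and their masks can be sketched as follows, in Python. The class and method names (Status, record, IntegerStatusError) are hypothetical. I assume, as stated above for overflow_mask, that a mask set to False silences the exception while the flag itself is still recorded.

```python
class IntegerStatusError(Exception):
    """Raised when an unmasked status flag is set."""


class Status:
    def __init__(self, overflow_mask=True, underflow_mask=True):
        self.overflow = False
        self.underflow = False
        # Mask True: the exception is raised. Mask False: it is silenced,
        # but the flag is still recorded in the object.
        self.overflow_mask = overflow_mask
        self.underflow_mask = underflow_mask

    def record(self, result, minimum, maximum):
        """Set the flags for result, then raise if an unmasked flag is set."""
        self.overflow = result > maximum
        self.underflow = result < minimum
        if self.overflow and self.overflow_mask:
            raise IntegerStatusError("overflow")
        if self.underflow and self.underflow_mask:
            raise IntegerStatusError("underflow")


quiet = Status(overflow_mask=False)
quiet.record(300, -128, 127)       # flag is set, no exception

strict = Status()
try:
    strict.record(300, -128, 127)
    raised = False
except IntegerStatusError:
    raised = True
```

Code that wants wrap-around arithmetic would use the quiet form and simply ignore the flag afterwards.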
The precision flag is masked by default. It is used to signal errors in computations that fit the value properly but generate losses of bits. Computations will be optimized whenever precision flagging is not required.
A simple example of precision loss is:
V := 3 / 2;
As we can see, 3 is odd. In this expression, we divide 3 by 2, which returns 1 in integer math. Now when we apply the opposite operation, 1 x 2, we do not get 3 again. This is a loss of precision. In most cases, this is not turned on since that is the expected behavior of an integer.
Precision checking on integers covers divisions and modulus operations. Fixed points also check the shifts that happen when copying a value between two different types when the number of bits after the decimal point is smaller in the destination.
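The fixed point case can be illustrated with a small Python sketch. The function name copy_fixed and its parameters are hypothetical. A fixed point value is represented here by its raw integer and a count of fractional bits, and precision is reported as lost when any of the bits shifted out is set.

```python
def copy_fixed(raw: int, src_frac_bits: int, dst_frac_bits: int):
    """Copy a fixed point value between two formats.

    Returns (new_raw, precision_lost).
    """
    if dst_frac_bits >= src_frac_bits:
        # The destination has at least as many fractional bits: nothing is lost.
        return raw << (dst_frac_bits - src_frac_bits), False
    shift = src_frac_bits - dst_frac_bits
    # Precision is lost when any of the bits about to be dropped is set.
    lost = (raw & ((1 << shift) - 1)) != 0
    return raw >> shift, lost


# 1.5 with 8 fractional bits is 384; it still fits exactly in 4 fractional bits.
exact = copy_fixed(384, 8, 4)
# 257 (1 + 1/256) needs the lowest bit, which 4 fractional bits cannot hold.
lossy = copy_fixed(257, 8, 4)
```

When the precision flag is masked, the test on the dropped bits can be skipped entirely, which is the optimization mentioned above.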
A type can define constraints on its values other than just the minimum and maximum values. Constraints are functions attached to a type. For instance, you could define a type as a value from -100 to +100 that only accepts even numbers. The constraint can be written as:
type my_percent is range -100 .. 100;

function even_only(value: my_percent'without_constraint) return Boolean is
begin
   -- "and" is not defined on signed integers, so test the parity with mod
   return value mod 2 = 0;
end even_only;

pragma dynamic_scalar_constraint(my_percent, even_only);
The pragma attaches the even_only() function to the type. Every time the value of a variable defined as that type is set, the function is called. If the function returns True, the value is accepted. If the function returns False, the set fails with an exception being raised.
Note that the function is also used for the Succ() and Pred(). The Succ() implementation is something like this:
function type_succ(value: Type) return value'Type is
   -- note that result is not given the dynamic scalar constraint
   result: value'Type'without_constraint := value;
begin
   loop
      result := result + 1;
      -- range check not shown...
      exit when value'Type'dynamic_scalar_constraint(result);
   end loop;
   return result;
end type_succ;
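For readers who want to try the idea, here is a runnable Python sketch of a constrained scalar type with a default Succ(). The names ScalarType, ConstraintError, check and succ are hypothetical. Unlike the pseudocode above, this version includes the range check, and I assume that stepping past the last value raises an exception.

```python
class ConstraintError(Exception):
    """Raised when a value violates the range or the dynamic constraint."""


class ScalarType:
    def __init__(self, first, last, constraint=None):
        self.first = first
        self.last = last
        self.constraint = constraint   # None means "no dynamic constraint"

    def accepts(self, value):
        if not (self.first <= value <= self.last):
            return False
        return self.constraint is None or self.constraint(value)

    def check(self, value):
        """Called every time a variable of this type is set."""
        if not self.accepts(value):
            raise ConstraintError(value)
        return value

    def succ(self, value):
        """Default Succ(): step by one until the constraint accepts the result."""
        result = value
        while True:
            result += 1
            if result > self.last:
                raise ConstraintError("no successor within range")
            if self.constraint is None or self.constraint(result):
                return result


my_percent = ScalarType(-100, 100, constraint=lambda v: v % 2 == 0)
next_value = my_percent.succ(10)   # skips 11, which is odd
```

Setting an odd value such as my_percent.check(3) raises ConstraintError, which matches the behavior described for the pragma.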
Note: we may want to implement each Type with an automatic subtype that does not include the constraint. That way, we can use that subtype in places like this. This subtype may be called "without_constraint" and as such we know that it does not need to itself have such a subtype (that pointer can loop onto the type itself.)
Note that we also implement the range constraint in that way. The pointer may be null if there is no range constraint (i.e. INTEGER.)
We also want to support pragmas to supply a custom type_succ() and type_pred(). These can be a lot faster than the default functions when the next/previous values are separated by large gaps, or when the validity checks are complicated enough to slow the system down considerably.
It would be possible to declare functions with "the correct name and parameter type". Yet, those are reserved to the user so I think it is preferable to have pragmas. Plus, having functions with the correct type sounds like magic (uncontrolled behavior.)
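The benefit of a user supplied successor can be sketched in Python as follows. TypeInfo, custom_succ and constraint_calls are hypothetical names; the counter exists only to show how many times the default loop has to call the constraint. The range check is left out to keep the sketch short.

```python
class TypeInfo:
    def __init__(self, constraint, custom_succ=None):
        self.constraint = constraint
        self.custom_succ = custom_succ   # stands in for what the pragma would attach
        self.constraint_calls = 0        # instrumentation for this sketch only

    def succ(self, value):
        if self.custom_succ is not None:
            # Fast path: the user knows how to jump straight to the next value.
            return self.custom_succ(value)
        result = value
        while True:
            result += 1
            self.constraint_calls += 1
            if self.constraint(result):
                return result


def is_multiple_of_1000(v):
    return v % 1000 == 0


slow = TypeInfo(is_multiple_of_1000)
fast = TypeInfo(is_multiple_of_1000,
                custom_succ=lambda v: (v // 1000 + 1) * 1000)

slow_result = slow.succ(2000)   # walks through 2001, 2002, ... one by one
fast_result = fast.succ(2000)   # computed directly
```

Both calls return the same value, but the default loop calls the constraint once per skipped value while the custom function never calls it.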
In order to accommodate large numbers, we want to have a large number library. This library will handle numbers of arbitrary length (e.g. 4096-bit numbers). When handling small values with those numbers, we want to be able to use small buffers. In this case, we want to be able to resize the value buffer as required over time.
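As a minimal sketch of a buffer that grows on demand, the following Python function adds to an unsigned number stored as little-endian bytes. The name add_in_place and the byte-per-digit layout are assumptions for illustration; a real library would handle signs and use machine words instead of bytes.

```python
def add_in_place(buf: bytearray, amount: int) -> None:
    """Add a non-negative amount to the little-endian unsigned number in buf.

    The buffer starts small and is extended only when a carry needs more room.
    """
    carry = amount
    i = 0
    while carry:
        if i == len(buf):
            buf.append(0)          # grow the value buffer by one byte
        total = buf[i] + carry
        buf[i] = total & 0xFF      # keep the low byte in place
        carry = total >> 8         # push the rest to the next byte
        i += 1


small = bytearray([0xFF])          # the value 255 in a one byte buffer
add_in_place(small, 1)             # 256 no longer fits in one byte
```

After the call the buffer holds two bytes; a number that never exceeds 255 would keep its one byte buffer.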
We want to have a flag to ensure that we know that an integer is dynamic. The flag prevents the compiler from optimizing out the size parameter of the variable when a variable is dynamic. Note that the dynamism may be optimized out when it is possible to tell that it is not required (i.e. if the value is defined between -256 and +256, it fits in 16 bits and thus we do not need to use Dynamic.)
By default values are considered static (pre-allocated with a static size.) However, this flag may be set to True at run time.
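The decision to drop the dynamic flag can be sketched as a width computation, here in Python. The function names and the list of native widths (8, 16, 32 and 64 bits) are assumptions; the real compiler would use whatever widths its target supports.

```python
def bits_needed(lo: int, hi: int) -> int:
    """Smallest two's complement width that holds every value in lo..hi."""
    bits = 1
    while not (-(1 << (bits - 1)) <= lo and hi <= (1 << (bits - 1)) - 1):
        bits += 1
    return bits


def needs_dynamic(lo: int, hi: int, native_widths=(8, 16, 32, 64)) -> bool:
    """True when no native width can hold the range, so Dynamic must stay set."""
    return bits_needed(lo, hi) > max(native_widths)


width = bits_needed(-256, 256)          # needs 10 bits, so 16 bits is enough
dynamic = needs_dynamic(-256, 256)
huge = needs_dynamic(0, 2 ** 64)        # one value past an unsigned 64 bit range
```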
It is very important for the optimizer to know whether a value is a constant. A constant can completely be optimized out since it does not change. Not only that, it can be converted to its machine code in the final program (watch out for necessary debug information.)
All values have a type definition. All values have a pointer to their type (which may be implied in some cases. In those cases, the type will somehow be saved in the debug data of the program.)
The type includes all the data that does not need to be defined in the value directly (i.e. Precision mask.)
In the other fields, the keyword [TYPE] is noted when the type defines that value and not directly the object.
Types are themselves composed of objects which means that some basic types need to be declared internally to get started.
The definition of value comes along with a large set of functions. With quite heavy optimization (i.e. knowing that some variables are constants), the result may end up being one assembly language instruction.
The operation in this statement:
a := b + c;
is handled with a call as follows:
procedure _integer_lib_add(a: out _variable; b: in _variable; c: in _variable; _raise_exception: in Boolean);
Notice that the name of the procedure is an internal name (starts with an underscore.) The parameter a is out only, while b and c are in parameters.
The function can then handle all the cases as required. The last parameter is used to know whether exceptions should be raised before returning. This is important since internally many operations should not raise exceptions until later.
The implementation of _integer_lib_add() is very complex. It is assumed, however, that the types of a, b and c are all the same since it is not otherwise possible to write the statement we first presented. Say a and b are of type MySmallInt and c is of type MyOtherInt, then you would have to write this to do the addition:
a := b + MySmallInt(c);
In other words, you do not need to cast within the add function.
The first test in the function can be the type:
-- sanity test, since the type is a constant, it will be optimized out
if a.type /= b.type or a.type /= c.type then
   raise ...
end if;
There are limits to this because we want to be able to add constants, and those can appear with internal constant types such as in:
a := b + 3;
In that case, the compiler has to cast 3 to MySmallInt in some automatic fashion (3 actually uses the special integer type called Universal Integer.) Yet, again, this should be done before calling the add function.
Now that we have tested the type, we want to do the addition. Assuming that we always have access to a type that is larger than the largest type the user is given access to, we can write the following, very much simplified, addition:
tb := tb'Type'(b);
tc := tc'Type'(c);
ta := tb + tc;
if ta > ta'Type(a'Type'Max_Value) then
   ta.overflow := True;
end if;
if ta < ta'Type(a'Type'Min_Value) then
   ta.underflow := True;
end if;
a := a'Type'(ta);
Here we assume that the + is the actual "processor level" addition (in ta := tb + tc). Notice that we first convert b and c to a new type that supports the addition without overflows or underflows. Then do the operation and compare the results.
There cannot be any loss of precision so we do not check that flag.
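The whole addition, including the type test and the raise_exception parameter, can be tried out with the following Python sketch. All names here (IntType, Variable, IntegerLibError, integer_lib_add) are hypothetical. Python integers never overflow, so they play the role of the larger internal type. I also assume that the last valid value is kept when a flag is raised; storing the out-of-range result instead would be an equally valid reading of the design.

```python
from dataclasses import dataclass


class IntegerLibError(Exception):
    """Raised for overflow or underflow when exceptions are requested."""


@dataclass(frozen=True)
class IntType:
    name: str
    min_value: int
    max_value: int


@dataclass
class Variable:
    type: IntType
    value: int = 0
    overflow: bool = False
    underflow: bool = False


def integer_lib_add(a: Variable, b: Variable, c: Variable,
                    raise_exception: bool) -> None:
    # Sanity test: all three operands must share one type.
    if a.type is not b.type or a.type is not c.type:
        raise TypeError("operands must be of the same type")
    # Python ints are unbounded, so this sum stands in for the wider type.
    wide = b.value + c.value
    a.overflow = wide > a.type.max_value
    a.underflow = wide < a.type.min_value
    if not (a.overflow or a.underflow):
        a.value = wide             # assumption: value only changes on success
    elif raise_exception:
        raise IntegerLibError(a.type.name)


small = IntType("MySmallInt", -128, 127)
a = Variable(small)
integer_lib_add(a, Variable(small, 100), Variable(small, 27), True)
fits = a.value                     # 127 is still in range

integer_lib_add(a, Variable(small, 100), Variable(small, 28), False)
```

After the second call a.overflow is set and no exception was raised, which lets the caller decide later what to do with the status.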
The basics for the division are the same as for the addition. The main difference is that the division itself needs to be checked for a remainder:
ta := tb / tc;
tr := tb rem tc;
if tr /= 0 then
   ta.precision := True;
end if;
Here we get the quotient and the remainder. If the remainder is not zero, then we have a precision error.
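The same check can be run in Python with the sketch below. The function name divide_with_precision is hypothetical. Ada's integer "/" truncates toward zero while Python's "//" rounds down, so the quotient is computed on absolute values to get the Ada behavior. Division by zero is not handled here.

```python
def divide_with_precision(tb: int, tc: int):
    """Return (quotient, precision_lost), truncating toward zero."""
    quotient = abs(tb) // abs(tc)
    if (tb < 0) != (tc < 0):
        quotient = -quotient
    # The remainder is whatever the quotient fails to account for.
    remainder = tb - quotient * tc
    return quotient, remainder != 0


lossy = divide_with_precision(3, 2)    # 3 / 2 gives 1, and 1 x 2 is not 3
exact = divide_with_precision(4, 2)
```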