protect_operations
, which derives from expaO
and provides exactly the same set of operations, but prior to any operation checks the size of the input numbers and computes bounds on the output. If the operation is at risk of overflow or underflow, it is not performed but eventually forwarded to the expression dag. No operation in
We already discussed that sign computation and compression are free from floating-point exceptions, which leaves us with the ring operations addition, subtraction, and multiplication. Bounding the magnitude of output summands can be done efficiently for expansions because the summands are ordered by magnitude.
Let $e = e_1, e_2, \ldots, e_m$ be a zero-free, strongly non-overlapping expansion with $E = \sum_{i=1}^{m} e_i$. We want to bound $E$ from above and below in terms of the leading summand $e_m$. Note that $e_m$ may consist of a single non-zero bit only. We have
$$|E| \le \sum_{i=1}^{m} |e_i| < 2\,\mathrm{msb}(e_m) \le 2|e_m|, \tag{5.1}$$
since $|e_1|, |e_2|, \ldots, |e_m|$ is non-overlapping. Finding a lower bound is slightly more involved. We distinguish two cases. First, let $|e_m|$ and $|e_{m-1}|$ be non-adjacent. Applying Equation (5.1) to the expansion $e_1, e_2, \ldots, e_{m-1}$, we have
$$|E| \ge |e_m| - \Bigl|\sum_{i=1}^{m-1} e_i\Bigr| > |e_m| - 2\,\mathrm{msb}(e_{m-1}) \ge \tfrac{1}{2}|e_m|. \tag{5.2}$$
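The other case is that $|e_m|$ and $|e_{m-1}|$ are adjacent. Then both are powers of two, so $|e_m| = 2|e_{m-1}|$, and $|e_{m-1}|$ and $|e_{m-2}|$ are non-adjacent. Using Equation (5.2) we have
$$|E| \ge |e_m| - |e_{m-1}| - \Bigl|\sum_{i=1}^{m-2} e_i\Bigr| = |e_{m-1}| - \Bigl|\sum_{i=1}^{m-2} e_i\Bigr| \ge \tfrac{1}{2}|e_{m-1}| = \tfrac{1}{4}|e_m|.$$
Combining these results, we have
$$\tfrac{1}{4}|e_m| \le |E| \le 2|e_m|$$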
in any case. Let $f = f_1, f_2, \ldots, f_n$ be another zero-free, strongly non-overlapping expansion with $F = \sum_{j=1}^{n} f_j$. We discuss addition and subtraction first, which are free from underflow in principle. Let $H = E + F$. For any summand $h$ in a strongly non-overlapping expansion representing $H$, we have
$$|h| \le 4|H| \le 4\bigl(|E| + |F|\bigr) \le 8\bigl(|e_m| + |f_n|\bigr) \le 8(1 + \varepsilon_m)\bigl(|e_m| \oplus |f_n|\bigr).$$
Therefore, if
$$|e_m| \oplus |f_n| \le \tfrac{1}{8}\tau, \tag{5.3}$$
we have $|h| \le 2\tau(1 - \varepsilon_m)$ and $h \in \mathbb{F}$. FastExpansionSum does not generate intermediate summands larger than output summands and is therefore safe from overflow if Equation (5.3) holds. Computing the difference of two expansions is safe from overflow under the same condition.
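The bounds on $|E|$ and the overflow test (5.3) can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used by protO: expansions are stored as `std::vector<double>` ordered by increasing magnitude, $\tau$ is taken to be $2^{1023}$ (so that $2\tau(1-\varepsilon_m)$ is the largest finite double), and the names `tau`, `magnitude_bounds`, and `addition_is_safe` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Assumption: tau = 2^1023, the largest power of two representable as a double.
inline double tau() { return std::ldexp(1.0, 1023); }

// Returns the pair (|e_m|/4, 2|e_m|) bounding |E| from below and above.
// Precondition (not checked): e is a non-empty, zero-free, strongly
// non-overlapping expansion stored by increasing magnitude, so the
// leading summand e_m is e.back().
inline std::pair<double, double> magnitude_bounds(const std::vector<double>& e) {
    assert(!e.empty());
    const double lead = std::fabs(e.back());
    return {0.25 * lead, 2.0 * lead};
}

// Criterion (5.3): |e_m| (+) |f_n| <= tau / 8, where (+) is the rounded
// floating-point addition. If the rounded sum overflows to +infinity,
// the comparison fails and the operation is reported as unsafe.
inline bool addition_is_safe(const std::vector<double>& e,
                             const std::vector<double>& f) {
    assert(!e.empty() && !f.empty());
    const double lead_sum = std::fabs(e.back()) + std::fabs(f.back());
    return lead_sum <= 0.125 * tau();
}
```

Only the two leading summands are inspected, so the test costs one addition and one comparison regardless of the lengths of the expansions.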
Multiplication may suffer from both overflow and underflow. We already gave criteria for TwoProduct to be safe from overflow and underflow in Section 2.2.2, and the reasoning for the general multiplication is not much different. Computing the product of $e$ and $f$ using any combination of ScaleExpansion and FastExpansionSum is free from underflow if
$$e_1 f_1 = 0 \quad\text{or}\quad |e_1||f_1| > \tfrac{1}{2}\varepsilon_m^{-2}\eta, \tag{5.4}$$
since we never leave the ring $\sigma\mathbb{Z}$, where $\sigma$ is the product of the smallest non-zero bits in $e$ and $f$. Remember that $e_1 = 0$ implies $m = 1$ and $E = 0$, i.e., there is no need to check other summands in this case. Now let $H = E \times F$ and let $h$ be a summand in a strongly non-overlapping expansion representing $H$. Then
$$|h| \le 16(1 + \varepsilon_m)\bigl(|e_m| \otimes |f_n|\bigr),$$
and $h \in \mathbb{F}$ if
$$|e_m| \otimes |f_n| \le \tfrac{1}{16}\tau. \tag{5.5}$$
The multiplication routine performs ScaleExpansion on $f$ for each summand of $e$ and then adds the intermediate expansions using FastExpansionSum. It does not create intermediate summands larger than output summands, and therefore the bounds above apply to all intermediate summands as well. If TwoProduct is based on a fused-multiply-add instruction, then no overflow occurs in a multiplication of two expansions if Equation (5.5) holds. If, however, TwoProduct is based on split, the splitting step may generate larger numbers. In this case, multiplication of expansions is safe from overflow if
$$\max\{|e_m|,\, 2^{\lceil p/2 \rceil} + 1\} \otimes \max\{|f_n|,\, 2^{\lceil p/2 \rceil} + 1\} \le \tfrac{1}{16}\tau. \tag{5.6}$$
The criteria (5.3), (5.4), (5.5), and (5.6) are used by protO to ensure that operations are free from floating-point exceptions. We did not implement a similar approach for plaiO, since bounding the output summands would first require locating the most significant and least significant input summands, which seemed too time-consuming in an unstructured sum.
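The multiplication criteria (5.4), (5.5), and (5.6) can be sketched in the same way. The sketch below is an illustration, not the implementation in protO. It assumes IEEE 754 doubles with $p = 53$, $\varepsilon_m = 2^{-53}$, $\tau = 2^{1023}$, and $\eta = 2^{-1074}$ (the smallest positive subnormal), so that $\tfrac{1}{2}\varepsilon_m^{-2}\eta = 2^{-969}$; the type `MulCheck` and the function `multiplication_is_safe` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Result of the two independent checks for an expansion product.
struct MulCheck {
    bool no_underflow;  // criterion (5.4)
    bool no_overflow;   // criterion (5.5), or (5.6) if TwoProduct uses split
};

// Precondition (not checked): e and f are non-empty, strongly non-overlapping
// expansions stored by increasing magnitude (e_1 first, e_m last).
inline MulCheck multiplication_is_safe(const std::vector<double>& e,
                                       const std::vector<double>& f,
                                       bool two_product_uses_fma) {
    assert(!e.empty() && !f.empty());
    const double e1 = std::fabs(e.front()), f1 = std::fabs(f.front());
    const double em = std::fabs(e.back()),  fn = std::fabs(f.back());

    // (5.4): e_1 f_1 = 0 or |e_1||f_1| > (1/2) eps_m^-2 eta = 2^-969.
    // The threshold is built with ldexp because 0.5 * eta would round to zero.
    const double underflow_threshold = std::ldexp(1.0, -969);
    const bool no_underflow =
        e1 == 0.0 || f1 == 0.0 || e1 * f1 > underflow_threshold;

    // (5.6) replaces each leading magnitude by at least the splitting
    // constant 2^ceil(p/2) + 1 = 2^27 + 1; with FMA, (5.5) uses them directly.
    const double split_constant = std::ldexp(1.0, 27) + 1.0;
    const double a = two_product_uses_fma ? em : std::max(em, split_constant);
    const double b = two_product_uses_fma ? fn : std::max(fn, split_constant);

    // Right-hand side tau / 16 = 2^1019. An overflowing product compares false.
    const bool no_overflow = a * b <= std::ldexp(1.0, 1019);

    return {no_underflow, no_overflow};
}
```

Because floating-point rounding is monotone, a rounded product that exceeds the threshold in (5.4) implies that the exact product does as well; the test can therefore only err on the side of rejecting a safe operation.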
5.1.4. Conversion to Expression Dags.
Arithmetic operations in Local_double_sum may fail for one of three reasons: an operation which is not addition, subtraction, multiplication, or sign computation is requested; the number of summands for the result of an operation, as predicted by #s, exceeds the maximum number of summands; or a floating-point exception is impending or has already occurred. In all of these cases, the input to the failed operation has to be transformed into an expression dag representation. We provide the following DataMediator models which may be used with Local_double_sum.

Local_double_sum_to_expression_dag_node_mediator (nodeM): To convert a sum into an expression dag, this model computes the value of the sum exactly using bigfloat numbers and operations from the ApproximationPolicy. Then it creates a single dag node storing the result.
[Figure 5.4. Collaboration of classes in a RealAlgebraic variant with Local_double_sum as LocalPolicy. Classes shown: Real_algebraic, Expansion_to_expression_dag_node_mediator, Basic_expression_dags, Leda_interval_filter_policy, Mpfr_approximation, Bfmss2_separation_bound, Local_double_sum, Double_sum_lazy_compression, Double_sum_no_compression, Double_sum_no_protection, Double_sum_expansion_zeroelim_selfprotect_operations, Double_sum_expansion_zeroelim_operations, Double_sum_storage.]

Expansion_to_expression_dag_node_mediator (expaM): This model converts the sum into a single dag node, too. In combination with plaiO, it behaves exactly like nodeM, but for expansions the improved conversion method based on Monotonize from Section 4.2 is used. There is another small improvement. Let $E$ be some number and $e_m$ and $e_{m-1}$ the two leading summands in the unique monotone, maximally non-overlapping expansion representing $E$. Then $|E - e_m| \le \mathrm{succ}(|e_{m-1}|)$, and $e_m$ and $\mathrm{succ}(|e_{m-1}|)$ are nearly optimal midpoint and radius of a floating-point interval containing $E$. Computing $\mathrm{succ}(|e_{m-1}|)$ is free from overflow, since $|e_{m-1}| < |e_m|$. We initialize the floating-point interval in the dag node directly from these numbers, instead of computing it from the exact bigfloat representation.
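The interval initialization described for expaM can be sketched as follows. This is a minimal illustration, not the library's code: it assumes the input is already the monotone, maximally non-overlapping expansion produced by Monotonize, stored by increasing magnitude, and it realizes succ with `std::nextafter`. The names `MidRad` and `enclosure_from_leading_summands` are hypothetical, and the single-summand case (radius zero) is an addition made here for completeness.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Midpoint-radius description of an interval [mid - rad, mid + rad].
struct MidRad {
    double mid;
    double rad;
};

// Precondition (not checked): e is a non-empty monotone, maximally
// non-overlapping expansion, ordered by increasing magnitude.
inline MidRad enclosure_from_leading_summands(const std::vector<double>& e) {
    assert(!e.empty());
    if (e.size() == 1) {
        // A single summand represents E exactly.
        return {e.back(), 0.0};
    }
    const double second = std::fabs(e[e.size() - 2]);
    // succ(|e_{m-1}|): the next larger double. This cannot overflow
    // because |e_{m-1}| < |e_m| and |e_m| is finite.
    const double rad =
        std::nextafter(second, std::numeric_limits<double>::infinity());
    return {e.back(), rad};
}
```

For the expansion $2^{-60}, 1$ this yields midpoint $1$ and radius $2^{-60} + 2^{-112}$, which encloses $E = 1 + 2^{-60}$ without touching any bigfloat arithmetic.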
Local_double_sum_to_expression_dag_tree_mediator (treeM): Directly converts the sum into an expression tree, more precisely a binary tree of minimal height, whose leaves store the summands and whose intermediate nodes are addition nodes. With this DataMediator, Local_double_sum may be seen as an expression rewriting engine, rewriting polynomial expressions over floating-point numbers into equivalent sums of floating-point numbers.

Local_double_sum_to_expression_dag_mediator_statistics (statM): This model collects a histogram of the number of summands in sums converted to an expression dag representation. It must be instantiated with another DataMediator model, which performs the actual conversion. These statistics allow us to study the effect different parameters have on the ability of Local_double_sum to defer dag creation.

Another conversion alternative would be to introduce addition nodes with arity greater than two to RealAlgebraic, as they are available in CORE::Expr 2. That would allow a conversion strategy which is similar to treeM, but saves the creation of all but one intermediate node.

5.1.5. Collaboration of Policies.
We can obtain a Local_double_sum variant by collecting a set of models for the policies and passing them to the host class Local_double_sum. The code below creates a RealAlgebraic variant by replacing the LocalPolicy in