creusillet@cri.mines-paristech.fr
http://www.cri.mines-paristech.fr/~creusil
A recent type of analysis (see Thomas Brandes and Paul Feautrier) has opened up wide perspectives in this area. It provides an exact analysis of array data flow, originally in monoprocedural programs with static control. This last constraint has since been partially removed by Vadim Maslov and Jean-François Collard, at the expense of accuracy. Furthermore, this method has not yet been extended to interprocedural analyses, and its complexity makes it useless on large programs.
Another approach is to compute conservative summaries of the effects of procedure calls on sets of array elements (Rémi Triolet, David Callahan and Ken Kennedy). Their relatively weak complexity (in practice) allows the analysis of large programs. But since these analyses are flow insensitive, and since they do not precisely take into account the modifications of the values of integer scalar variables, they are not accurate enough to support powerful optimizations.
In PIPS, the Interprocedural Parallelizer of Scientific Programs developed at École des mines de Paris, we have extended Triolet's array regions to compute summaries that exactly represent the effects of statements and procedures on sets of array elements, whenever possible; whereas the regions originally defined by Triolet were over-approximations of these effects. The computation of READ and WRITE regions relies on a preliminary analysis of the exact effects of statements and procedures upon array elements.
The resulting regions are already used in PIPS to enhance the dependence analysis when there are procedure calls, and to efficiently compile HPF. However, they cannot be used to compute array data flow, and are thus insufficient for other optimizations such as array privatization.
We have therefore introduced two new types of exact regions: for any statement or procedure, IN regions contain its imported array elements, and OUT regions represent its set of live array elements.
The possible applications are numerous. These regions are already used in PIPS to privatize array sections. In massively parallel or heterogeneous systems, they can also be used to calculate the communications before and after the execution of a piece of code. For a hierarchical memory machine, they provide the sets of array elements that are used or reused, and hence should be prefetched (IN regions) or kept (OUT regions) in caches or local memories; the array elements that do not appear in these sets and that are accessed in the current piece of code, are only temporaries, and should be handled as such. In fault-tolerant systems where the current state is regularly saved by a software component (checkpointing) IN or OUT regions could provide the set of elements that will be used in further computations, and thus could be used to reduce the amount of data to be saved. Other applications would be the verification of software specifications, support for out-of-core computations...
To support the exactness of the analysis, an accurate interprocedural translation is needed. However, by examining the Perfect Club benchmarks, we found out that the existing methods for handling array reshapes were insufficient. We have implemented in PIPS a general linear framework that handles array reshapes in most cases, including when the arrays are not of the same type, or belong to a COMMON which does not have the same layout in the caller and the callee.
The following sections give some more details about array regions.
<A(phi1,phi2)-W-MUST-{phi1==I, phi1==phi2}>where phi1 et phi2 respectively represent the first and second dimensions of A, corresponds to an assignement of the element A(I,I).
In order to manipulate regions and propagate them along the control flow graph of the program, several binary operators are needed: union, intersection, and difference. Unary operators are also used to eliminate the variables of the program that appear in the region predicates and that are modified. See some related publications for more details on these operators.
IN regions contain the array elements, whose values are ( EXACT) or may be (MAY) imported by the current piece of code. These are the elements that are read before being possibly redefined by another instruction of the same fragment.
OUT regions corresponding to a piece of code contain the array elements that it defines, and that are ( EXACT) or may be (MAY) used afterwards, in the execution order of the program. These are the live or exported array elements.
The OUT region of a statement is not defiend per se, but depends on the future of the computation. For instance, the OUT region of S1 in program S1;S2 is a function of S1;S2 as a whole, and of S2. Thus, OUT regions are propagateed in a top-down fashion along the call graph and hierarchical control flow graph of the program. Since I/O operations are part of the program, the OUT regions of the main program, from which the other OUT regions are derived, is initialized to the empty set.
On the contrary, the interprocedural propagation of OUT regions is a forward analysis. The regions of all the call sites are first translated from the callers'name space into the callee's name space, and are then merged to form a unique summary.
Because the source and target arrays may not have the same declaration (array reshaping), the translation is not a straightforward operation.
In his thesis, Triolet gave conditions to perform the translation by simply changing the name of the array, and projecting the polyhedron onto the name space of the caller. These conditions are that the formal array is either similar to the actual array, or a subarray of the actual array, for instance a single column of a matrix.
For the Perfect Club Benchmarks, these conditions are too restrictive. In particular, they cannot handle offsets in declarations, general offsets in the references at call site, and array reshaping is rarely treated.
In order to handle the general case, subscript values (see the FORTRAN standard) are used. The subscript value of an array element is its rank in the array, given the fact that array elements are held in column order. This is equivalent to a linearization of the array. However, this may lead to non-linear expressions that are not handled in PIPS. The method tries instead to find similar dimensions, which are translated in a trivial way. The subscript values are then used for the remaining dimensions.
The translation from formal to real parameters, as well as the translation of global variables (arrays in COMMONs, even in case of a partial matching), are supported. The translation of arrays with different types of elements is also performed.