| Conference paper [LMT:00] | Götz Lindenmaier, Kathryn S. McKinley, Olivier Temam, Load Scheduling with Profile Information, Arndt Bode, Thomas Ludwig, and Roland Wismüller (Eds.), Euro-Par 2000 -- Parallel Processing, pp. 223-233, Springer Verlag, Aug 2000.
|
Abstract
Within the past five years, many manufacturers
have added hardware performance counters to their microprocessors to
generate profile data cheaply. We show how to use Compaq's DCPI tool to
determine load latencies at a fine, per-instruction granularity and
use them to improve instruction scheduling. We validate our
heuristic for using DCPI latency data to classify loads as hits and misses
against simulation numbers. We map our classification into the Multiflow
compiler's intermediate representation, and use a locality-sensitive
Balanced scheduling algorithm. Our experiments illustrate that our
algorithm improves run times by 1% on average, and by up to 10%, on a Compaq
Alpha.