//===------ ZoneAlgo.cpp ----------------------------------------*- C++ -*-===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// Derive information about array elements between statements ("Zones").
//
// The algorithms here work on the scatter space - the image space of the
// schedule returned by Scop::getSchedule(). We call an element in that space a
// "timepoint". Timepoints are lexicographically ordered such that we can
// define ranges in the scatter space. We use two flavors of such ranges:
// Timepoint sets and zones. A timepoint set is simply a subset of the scatter
// space and is directly stored as isl_set.
//
// Zones are used to describe the space between timepoints as open sets, i.e.
// they do not contain the extrema. Using isl rational sets to express these
// would be overkill. We also cannot store them as the integer timepoints they
// contain; the (nonempty) zone between 1 and 2 would be empty and
// indistinguishable from e.g. the zone between 3 and 4. Also, we cannot store
// the integer set including the extrema; the set ]1,2[ + ]3,4[ could be
// coalesced to ]1,3[, although we defined the range [2,3] not to be in the set.
// Instead, we store the "half-open" integer extrema, including the lower bound,
// but excluding the upper bound. Examples:
//
// * The set { [i] : 1 <= i <= 3 } represents the zone ]0,3[ (which contains the
//   integer points 1 and 2, but not 0 or 3)
//
// * { [1] } represents the zone ]0,1[
//
// * { [i] : i = 1 or i = 3 } represents the zone ]0,1[ + ]2,3[
//
// Therefore, an integer i in the set represents the zone ]i-1,i[, i.e. strictly
// speaking the integer points never belong to the zone.
// However, depending on the interpretation, one might want to include them.
// Part of the interpretation may not be known when the zone is constructed.
//
// Reads are assumed to always take place before writes, hence we can think of
// reads taking place at the beginning of a timepoint and writes at the end.
//
// Let's assume that the zone represents the lifetime of a variable. That is,
// the zone begins with a write that defines the value during its lifetime and
// ends with the last read of that value. In the following we consider whether a
// read/write at the beginning/ending of the lifetime zone should be within the
// zone or outside of it.
//
// * A read at the timepoint that starts the live-range loads the previous
//   value. Hence, exclude the timepoint starting the zone.
//
// * For a write at the timepoint that starts the live-range, it is undefined
//   whether it occurs before or after the write that starts the lifetime. We do
//   not allow this situation to occur. Hence, we include the timepoint starting
//   the zone to determine whether they are conflicting.
//
// * A read at the timepoint that ends the live-range reads the same variable.
//   We include the timepoint at the end of the zone to include that read into
//   the live-range. Doing otherwise would mean that the two reads access
//   different values, which would mean that the values they read are both alive
//   at the same time but occupy the same variable.
//
// * A write at the timepoint that ends the live-range starts a new live-range.
//   It must not be included in the live-range of the previous definition.
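// The half-open convention above can be illustrated without isl. The following
// is a minimal sketch (not part of Polly) using plain integer timepoints:
// `Zone` and `zoneToTimepoints` are hypothetical names, and the helper mirrors
// only the single-integer role of convertZoneToTimepoints.

```cpp
#include <cassert>
#include <set>

// An integer i in a Zone stands for the open unit interval ]i-1,i[.
// E.g. the zone ]0,3[ is stored as {1,2,3}.
using Zone = std::set<int>;

// Convert a zone to the timepoints at its interval boundaries. InclStart
// includes the timepoint starting each unit interval (i-1), InclEnd the one
// ending it (i). With both flags false, no timepoint belongs to the zone.
std::set<int> zoneToTimepoints(const Zone &Z, bool InclStart, bool InclEnd) {
  std::set<int> Result;
  for (int I : Z) {
    if (InclStart)
      Result.insert(I - 1);
    if (InclEnd)
      Result.insert(I);
  }
  return Result;
}
```

// For example, the zone ]0,3[ = {1,2,3} yields the timepoints {0,1,2} when
// only the starting timepoints are included, matching the read/write endpoint
// discussion above.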
//
// All combinations of reads and writes at the endpoints are possible, but most
// of the time only the write->read (for instance, a live-range from definition
// to last use) and read->write (for instance, an unused range from last use to
// overwrite) combinations are interesting (half-open ranges). write->write
// zones might be useful as well in some contexts to represent
// output dependencies.
//
// @see convertZoneToTimepoints
//
//
// The code makes use of maps and sets in many different spaces. To not lose
// track of which space a set or map is expected to be in, variables holding an
// isl reference are usually annotated in the comments. They roughly follow isl
// syntax for spaces, but only the tuples, not the dimensions. The tuples have a
// meaning as follows:
//
// * Space[] - An unspecified tuple. Used for function parameters such that the
//             function caller can use it for anything they like.
//
// * Domain[] - A statement instance as returned by ScopStmt::getDomain()
//              isl_id_get_name: Stmt_<NameOfBasicBlock>
//              isl_id_get_user: Pointer to ScopStmt
//
// * Element[] - An array element as in the range part of
//               MemoryAccess::getAccessRelation()
//               isl_id_get_name: MemRef_<NameOfArrayVariable>
//               isl_id_get_user: Pointer to ScopArrayInfo
//
// * Scatter[] - Scatter space or space of timepoints
//               Has no tuple id
//
// * Zone[] - Range between timepoints as described above
//            Has no tuple id
//
// * ValInst[] - An llvm::Value as defined at a specific timepoint.
//
//   A ValInst[] itself can be structured as one of:
//
//   * [] - An unknown value.
//     Always zero dimensions
//     Has no tuple id
//
//   * Value[] - An llvm::Value that is read-only in the SCoP, i.e. its
//     runtime content does not depend on the timepoint.
//     Always zero dimensions
//     isl_id_get_name: Val_<NameOfValue>
//     isl_id_get_user: A pointer to an llvm::Value
//
//   * SCEV[...] - A synthesizable llvm::SCEV expression.
//     In contrast to a Value[], it has at least one dimension per
//     SCEVAddRecExpr in the SCEV.
//
//   * [Domain[] -> Value[]] - An llvm::Value that may change during the
//     Scop's execution.
//     The tuple itself has no id, but it wraps a map space holding a
//     statement instance which defines the llvm::Value as the map's domain
//     and llvm::Value itself as range.
//
// @see makeValInst()
//
// An annotation "{ Domain[] -> Scatter[] }" therefore means: A map from a
// statement instance to a timepoint, aka a schedule. There is only one scatter
// space, but most of the time multiple statements are processed in one set.
// This is why most of the time isl_union_map has to be used.
//
// The basic algorithm works as follows:
// At first we verify that the SCoP is compatible with this technique. For
// instance, two writes cannot write to the same location at the same statement
// instance because we cannot determine within the polyhedral model which one
// comes first. Once this is verified, we compute zones at which an array
// element is unused. This computation can fail if it takes too long. Then the
// main algorithm is executed. Because every store potentially trails an unused
// zone, we start at stores. We search for a scalar (MemoryKind::Value or
// MemoryKind::PHI) that we can map to the array element overwritten by the
// store, preferably one that is used by the store or at least the ScopStmt.
// When it does not conflict with the lifetime of the values in the array
// element, the map is applied and the unused zone updated as it is now used. We
// continue to try to map scalars to the array element until there are no more
// candidates to map.
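// The greedy mapping step can be sketched with the same integer-set model of
// zones as above. This is an illustrative sketch only, not Polly's isl-based
// implementation; `IntervalSet`, `mapScalarsGreedily`, and the scalar names are
// hypothetical:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Integer i stands for the unit interval ]i-1,i[, as in the zone convention.
using IntervalSet = std::set<int>;

static bool containsAll(const IntervalSet &Outer, const IntervalSet &Inner) {
  for (int I : Inner)
    if (!Outer.count(I))
      return false;
  return true;
}

// Map scalars (name, lifetime) into an array element's unused zone. The first
// scalar whose lifetime fits the still-unused intervals wins; later candidates
// that would have fit the same intervals are rejected, so the result depends
// on the processing order.
std::vector<std::string>
mapScalarsGreedily(IntervalSet Unused,
                   const std::vector<std::pair<std::string, IntervalSet>> &Scalars) {
  std::vector<std::string> Mapped;
  for (const auto &Scalar : Scalars) {
    if (!containsAll(Unused, Scalar.second))
      continue; // Conflicts with a value already occupying the element.
    for (int I : Scalar.second)
      Unused.erase(I); // These intervals are now used by this scalar.
    Mapped.push_back(Scalar.first);
  }
  return Mapped;
}
```

// E.g. with the unused zone {1,2,3,4} and scalars a:{1,2}, b:{2,3}, c:{3,4}
// (in that order), a and c are mapped while b is rejected because interval 2
// is already taken, illustrating the order dependence described above.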
// The algorithm is greedy in the sense that the first scalar not conflicting
// will be mapped. Other scalars processed later that could have fit the same
// unused zone will be rejected. As such the result depends on the processing
// order.
//
//===----------------------------------------------------------------------===//

#include "polly/ZoneAlgo.h"
#include "polly/ScopInfo.h"
#include "polly/Support/GICHelper.h"
#include "polly/Support/ISLTools.h"
#include "polly/Support/VirtualInstruction.h"
#include "llvm/ADT/Statistic.h"
#include "llvm/Support/raw_ostream.h"

#define DEBUG_TYPE "polly-zone"

STATISTIC(NumIncompatibleArrays, "Number of not zone-analyzable arrays");
STATISTIC(NumCompatibleArrays, "Number of zone-analyzable arrays");
STATISTIC(NumRecursivePHIs, "Number of recursive PHIs");
STATISTIC(NumNormalizablePHIs, "Number of normalizable PHIs");
STATISTIC(NumPHINormialization, "Number of PHI executed normalizations");

using namespace polly;
using namespace llvm;

static isl::union_map computeReachingDefinition(isl::union_map Schedule,
                                                isl::union_map Writes,
                                                bool InclDef, bool InclRedef) {
  return computeReachingWrite(Schedule, Writes, false, InclDef, InclRedef);
}

/// Compute the reaching definition of a scalar.
///
/// Compared to computeReachingDefinition, there is just one element which is
/// accessed and therefore only a set of instances that access that element is
/// required.
///
/// @param Schedule  { DomainWrite[] -> Scatter[] }
/// @param Writes    { DomainWrite[] }
/// @param InclDef   Include the timepoint of the definition in the result.
/// @param InclRedef Include the timepoint of the overwrite in the result.
///
/// @return { Scatter[] -> DomainWrite[] }
static isl::union_map computeScalarReachingDefinition(isl::union_map Schedule,
                                                      isl::union_set Writes,
                                                      bool InclDef,
                                                      bool InclRedef) {
  // { DomainWrite[] -> Element[] }
  isl::union_map Defs = isl::union_map::from_domain(Writes);

  // { [Element[] -> Scatter[]] -> DomainWrite[] }
  auto ReachDefs =
      computeReachingDefinition(Schedule, Defs, InclDef, InclRedef);

  // { Scatter[] -> DomainWrite[] }
  return ReachDefs.curry().range().unwrap();
}

/// Compute the reaching definition of a scalar.
///
/// This overload accepts only a single writing statement as an isl_map,
/// consequently the result is also a single isl_map.
///
/// @param Schedule  { DomainWrite[] -> Scatter[] }
/// @param Writes    { DomainWrite[] }
/// @param InclDef   Include the timepoint of the definition in the result.
/// @param InclRedef Include the timepoint of the overwrite in the result.
///
/// @return { Scatter[] -> DomainWrite[] }
static isl::map computeScalarReachingDefinition(isl::union_map Schedule,
                                                isl::set Writes, bool InclDef,
                                                bool InclRedef) {
  isl::space DomainSpace = Writes.get_space();
  isl::space ScatterSpace = getScatterSpace(Schedule);

  // { Scatter[] -> DomainWrite[] }
  isl::union_map UMap = computeScalarReachingDefinition(
      Schedule, isl::union_set(Writes), InclDef, InclRedef);

  isl::space ResultSpace = ScatterSpace.map_from_domain_and_range(DomainSpace);
  return singleton(UMap, ResultSpace);
}

isl::union_map polly::makeUnknownForDomain(isl::union_set Domain) {
  return isl::union_map::from_domain(Domain);
}

/// Create a domain-to-unknown value mapping.
///
/// @see makeUnknownForDomain(isl::union_set)
///
/// @param Domain { Domain[] }
///
/// @return { Domain[] -> ValInst[] }
static isl::map makeUnknownForDomain(isl::set Domain) {
  return isl::map::from_domain(Domain);
}

/// Return whether @p Map maps to an unknown value.
///
/// @param { [] -> ValInst[] }
static bool isMapToUnknown(const isl::map &Map) {
  isl::space Space = Map.get_space().range();
  return Space.has_tuple_id(isl::dim::set).is_false() &&
         Space.is_wrapping().is_false() && Space.dim(isl::dim::set) == 0;
}

isl::union_map polly::filterKnownValInst(const isl::union_map &UMap) {
  isl::union_map Result = isl::union_map::empty(UMap.get_space());
  for (isl::map Map : UMap.get_map_list()) {
    if (!isMapToUnknown(Map))
      Result = Result.add_map(Map);
  }
  return Result;
}

ZoneAlgorithm::ZoneAlgorithm(const char *PassName, Scop *S, LoopInfo *LI)
    : PassName(PassName), IslCtx(S->getSharedIslCtx()), S(S), LI(LI),
      Schedule(S->getSchedule()) {
  auto Domains = S->getDomains();

  Schedule = Schedule.intersect_domain(Domains);
  ParamSpace = Schedule.get_space();
  ScatterSpace = getScatterSpace(Schedule);
}

/// Check if all stores in @p Stmt store the very same value.
///
/// This covers a special situation occurring in Polybench's
/// covariance/correlation (which is typical for algorithms that cover symmetric
/// matrices):
///
///   for (int i = 0; i < n; i += 1)
///     for (int j = 0; j <= i; j += 1) {
///       double x = ...;
///       C[i][j] = x;
///       C[j][i] = x;
///     }
///
/// For i == j, the same value is written twice to the same element. Double
/// writes to the same element are not allowed in DeLICM because its algorithm
/// does not see which of the writes is effective. But if it's the same value
/// anyway, it doesn't matter.
///
/// LLVM passes, however, cannot simplify this because the write is necessary
/// for i != j (unless it would add a condition for one of the writes to occur
/// only if i != j).
///
/// TODO: In the future we may want to extend this to make the checks
///       specific to different memory locations.
static bool onlySameValueWrites(ScopStmt *Stmt) {
  Value *V = nullptr;

  for (auto *MA : *Stmt) {
    if (!MA->isLatestArrayKind() || !MA->isMustWrite() ||
        !MA->isOriginalArrayKind())
      continue;

    if (!V) {
      V = MA->getAccessValue();
      continue;
    }

    if (V != MA->getAccessValue())
      return false;
  }
  return true;
}

/// Is @p InnerLoop nested inside @p OuterLoop?
static bool isInsideLoop(Loop *OuterLoop, Loop *InnerLoop) {
  // If OuterLoop is nullptr, we cannot call its contains() method. In this case
  // OuterLoop represents the 'top level' and therefore contains all loops.
  return !OuterLoop || OuterLoop->contains(InnerLoop);
}

void ZoneAlgorithm::collectIncompatibleElts(ScopStmt *Stmt,
                                            isl::union_set &IncompatibleElts,
                                            isl::union_set &AllElts) {
  auto Stores = makeEmptyUnionMap();
  auto Loads = makeEmptyUnionMap();

  // This assumes that the MemoryKind::Array MemoryAccesses are iterated in
  // order.
  for (auto *MA : *Stmt) {
    if (!MA->isOriginalArrayKind())
      continue;

    isl::map AccRelMap = getAccessRelationFor(MA);
    isl::union_map AccRel = AccRelMap;

    // To avoid solving any ILP problems, always add entire arrays instead of
    // just the elements that are accessed.
    auto ArrayElts = isl::set::universe(AccRelMap.get_space().range());
    AllElts = AllElts.add_set(ArrayElts);

    if (MA->isRead()) {
      // Reject load after store to same location.
      if (!Stores.is_disjoint(AccRel)) {
        LLVM_DEBUG(
            dbgs() << "Load after store of same element in same statement\n");
        OptimizationRemarkMissed R(PassName, "LoadAfterStore",
                                   MA->getAccessInstruction());
        R << "load after store of same element in same statement";
        R << " (previous stores: " << Stores;
        R << ", loading: " << AccRel << ")";
        S->getFunction().getContext().diagnose(R);

        IncompatibleElts = IncompatibleElts.add_set(ArrayElts);
      }

      Loads = Loads.unite(AccRel);

      continue;
    }

    // In region statements the order is less clear, e.g. the load and store
    // might be in a boxed loop.
    if (Stmt->isRegionStmt() && !Loads.is_disjoint(AccRel)) {
      LLVM_DEBUG(dbgs() << "WRITE in non-affine subregion not supported\n");
      OptimizationRemarkMissed R(PassName, "StoreInSubregion",
                                 MA->getAccessInstruction());
      R << "store is in a non-affine subregion";
      S->getFunction().getContext().diagnose(R);

      IncompatibleElts = IncompatibleElts.add_set(ArrayElts);
    }

    // Do not allow more than one store to the same location.
    if (!Stores.is_disjoint(AccRel) && !onlySameValueWrites(Stmt)) {
      LLVM_DEBUG(dbgs() << "WRITE after WRITE to same element\n");
      OptimizationRemarkMissed R(PassName, "StoreAfterStore",
                                 MA->getAccessInstruction());
      R << "store after store of same element in same statement";
      R << " (previous stores: " << Stores;
      R << ", storing: " << AccRel << ")";
      S->getFunction().getContext().diagnose(R);

      IncompatibleElts = IncompatibleElts.add_set(ArrayElts);
    }

    Stores = Stores.unite(AccRel);
  }
}

void ZoneAlgorithm::addArrayReadAccess(MemoryAccess *MA) {
  assert(MA->isLatestArrayKind());
  assert(MA->isRead());
  ScopStmt *Stmt = MA->getStatement();

  // { DomainRead[] -> Element[] }
  auto AccRel = intersectRange(getAccessRelationFor(MA), CompatibleElts);
  AllReads = AllReads.add_map(AccRel);

  if (LoadInst *Load = dyn_cast_or_null<LoadInst>(MA->getAccessInstruction())) {
    // { DomainRead[] -> ValInst[] }
    isl::map LoadValInst = makeValInst(
        Load, Stmt, LI->getLoopFor(Load->getParent()), Stmt->isBlockStmt());

    // { DomainRead[] -> [Element[] -> DomainRead[]] }
    isl::map IncludeElement = AccRel.domain_map().curry();

    // { [Element[] -> DomainRead[]] -> ValInst[] }
    isl::map EltLoadValInst = LoadValInst.apply_domain(IncludeElement);

    AllReadValInst = AllReadValInst.add_map(EltLoadValInst);
  }
}

isl::union_map ZoneAlgorithm::getWrittenValue(MemoryAccess *MA,
                                              isl::map AccRel) {
  if (!MA->isMustWrite())
    return {};

  Value *AccVal = MA->getAccessValue();
  ScopStmt *Stmt = MA->getStatement();
  Instruction *AccInst = MA->getAccessInstruction();

  // Write a value to a single element.
  auto L = MA->isOriginalArrayKind() ?
      LI->getLoopFor(AccInst->getParent())
      : Stmt->getSurroundingLoop();
  if (AccVal &&
      AccVal->getType() == MA->getLatestScopArrayInfo()->getElementType() &&
      AccRel.is_single_valued().is_true())
    return makeNormalizedValInst(AccVal, Stmt, L);

  // memset(_, '0', ) is equivalent to writing the null value to all touched
  // elements. isMustWrite() ensures that all of an element's bytes are
  // overwritten.
  if (auto *Memset = dyn_cast<MemSetInst>(AccInst)) {
    auto *WrittenConstant = dyn_cast<Constant>(Memset->getValue());
    Type *Ty = MA->getLatestScopArrayInfo()->getElementType();
    if (WrittenConstant && WrittenConstant->isZeroValue()) {
      Constant *Zero = Constant::getNullValue(Ty);
      return makeNormalizedValInst(Zero, Stmt, L);
    }
  }

  return {};
}

void ZoneAlgorithm::addArrayWriteAccess(MemoryAccess *MA) {
  assert(MA->isLatestArrayKind());
  assert(MA->isWrite());
  auto *Stmt = MA->getStatement();

  // { Domain[] -> Element[] }
  isl::map AccRel = intersectRange(getAccessRelationFor(MA), CompatibleElts);

  if (MA->isMustWrite())
    AllMustWrites = AllMustWrites.add_map(AccRel);

  if (MA->isMayWrite())
    AllMayWrites = AllMayWrites.add_map(AccRel);

  // { Domain[] -> ValInst[] }
  isl::union_map WriteValInstance = getWrittenValue(MA, AccRel);
  if (!WriteValInstance)
    WriteValInstance = makeUnknownForDomain(Stmt);

  // { Domain[] -> [Element[] -> Domain[]] }
  isl::map IncludeElement = AccRel.domain_map().curry();

  // { [Element[] -> DomainWrite[]] -> ValInst[] }
  isl::union_map EltWriteValInst =
      WriteValInstance.apply_domain(IncludeElement);

  AllWriteValInst = AllWriteValInst.unite(EltWriteValInst);
}

/// For an llvm::Value defined in @p DefStmt, compute the RAW dependency for a
/// use in every instance of @p UseStmt.
///
/// @param UseStmt Statement a scalar is used in.
/// @param DefStmt Statement a scalar is defined in.
///
/// @return { DomainUse[] -> DomainDef[] }
isl::map ZoneAlgorithm::computeUseToDefFlowDependency(ScopStmt *UseStmt,
                                                      ScopStmt *DefStmt) {
  // { DomainUse[] -> Scatter[] }
  isl::map UseScatter = getScatterFor(UseStmt);

  // { Zone[] -> DomainDef[] }
  isl::map ReachDefZone = getScalarReachingDefinition(DefStmt);

  // { Scatter[] -> DomainDef[] }
  isl::map ReachDefTimepoints =
      convertZoneToTimepoints(ReachDefZone, isl::dim::in, false, true);

  // { DomainUse[] -> DomainDef[] }
  return UseScatter.apply_range(ReachDefTimepoints);
}

/// Return whether @p PHI refers (also transitively through other PHIs) to
/// itself.
///
///   loop:
///     %phi1 = phi [0, %preheader], [%phi1, %loop]
///     br i1 %c, label %loop, label %exit
///
///   exit:
///     %phi2 = phi [%phi1, %loop]
///
/// In this example, %phi1 is recursive, but %phi2 is not.
static bool isRecursivePHI(const PHINode *PHI) {
  SmallVector<const PHINode *, 8> Worklist;
  SmallPtrSet<const PHINode *, 8> Visited;
  Worklist.push_back(PHI);

  while (!Worklist.empty()) {
    const PHINode *Cur = Worklist.pop_back_val();

    if (Visited.count(Cur))
      continue;
    Visited.insert(Cur);

    for (const Use &Incoming : Cur->incoming_values()) {
      Value *IncomingVal = Incoming.get();
      auto *IncomingPHI = dyn_cast<PHINode>(IncomingVal);
      if (!IncomingPHI)
        continue;

      if (IncomingPHI == PHI)
        return true;
      Worklist.push_back(IncomingPHI);
    }
  }
  return false;
}

isl::union_map ZoneAlgorithm::computePerPHI(const ScopArrayInfo *SAI) {
  // TODO: If the PHI has an incoming block from before the SCoP, it is not
  // represented in any ScopStmt.

  auto *PHI = cast<PHINode>(SAI->getBasePtr());
  auto It = PerPHIMaps.find(PHI);
  if (It != PerPHIMaps.end())
    return It->second;

  assert(SAI->isPHIKind());

  // { DomainPHIWrite[] -> Scatter[] }
  isl::union_map PHIWriteScatter = makeEmptyUnionMap();

  // Collect all incoming block timepoints.
  for (MemoryAccess *MA : S->getPHIIncomings(SAI)) {
    isl::map Scatter = getScatterFor(MA);
    PHIWriteScatter = PHIWriteScatter.add_map(Scatter);
  }

  // { DomainPHIRead[] -> Scatter[] }
  isl::map PHIReadScatter = getScatterFor(S->getPHIRead(SAI));

  // { DomainPHIRead[] -> Scatter[] }
  isl::map BeforeRead = beforeScatter(PHIReadScatter, true);

  // { Scatter[] }
  isl::set WriteTimes = singleton(PHIWriteScatter.range(), ScatterSpace);

  // { DomainPHIRead[] -> Scatter[] }
  isl::map PHIWriteTimes = BeforeRead.intersect_range(WriteTimes);
  isl::map LastPerPHIWrites = PHIWriteTimes.lexmax();

  // { DomainPHIRead[] -> DomainPHIWrite[] }
  isl::union_map Result =
      isl::union_map(LastPerPHIWrites).apply_range(PHIWriteScatter.reverse());
  assert(!Result.is_single_valued().is_false());
  assert(!Result.is_injective().is_false());

  PerPHIMaps.insert({PHI, Result});
  return Result;
}

isl::union_set ZoneAlgorithm::makeEmptyUnionSet() const {
  return isl::union_set::empty(ParamSpace);
}

isl::union_map ZoneAlgorithm::makeEmptyUnionMap() const {
  return isl::union_map::empty(ParamSpace);
}

void ZoneAlgorithm::collectCompatibleElts() {
  // First find all the incompatible elements, then take the complement.
  // We compile the list of compatible (rather than incompatible) elements so
  // users can intersect with the list, not requiring a subtract operation. It
  // also allows us to define a 'universe' of all elements and makes it more
  // explicit which array elements can be used.
  isl::union_set AllElts = makeEmptyUnionSet();
  isl::union_set IncompatibleElts = makeEmptyUnionSet();

  for (auto &Stmt : *S)
    collectIncompatibleElts(&Stmt, IncompatibleElts, AllElts);

  NumIncompatibleArrays += isl_union_set_n_set(IncompatibleElts.get());
  CompatibleElts = AllElts.subtract(IncompatibleElts);
  NumCompatibleArrays += isl_union_set_n_set(CompatibleElts.get());
}

isl::map ZoneAlgorithm::getScatterFor(ScopStmt *Stmt) const {
  isl::space ResultSpace =
      Stmt->getDomainSpace().map_from_domain_and_range(ScatterSpace);
  return Schedule.extract_map(ResultSpace);
}

isl::map ZoneAlgorithm::getScatterFor(MemoryAccess *MA) const {
  return getScatterFor(MA->getStatement());
}

isl::union_map ZoneAlgorithm::getScatterFor(isl::union_set Domain) const {
  return Schedule.intersect_domain(Domain);
}

isl::map ZoneAlgorithm::getScatterFor(isl::set Domain) const {
  auto ResultSpace = Domain.get_space().map_from_domain_and_range(ScatterSpace);
  auto UDomain = isl::union_set(Domain);
  auto UResult = getScatterFor(std::move(UDomain));
  auto Result = singleton(std::move(UResult), std::move(ResultSpace));
  assert(!Result || Result.domain().is_equal(Domain) == isl_bool_true);
  return Result;
}

isl::set ZoneAlgorithm::getDomainFor(ScopStmt *Stmt) const {
  return Stmt->getDomain().remove_redundancies();
}

isl::set ZoneAlgorithm::getDomainFor(MemoryAccess *MA) const {
  return getDomainFor(MA->getStatement());
}

isl::map ZoneAlgorithm::getAccessRelationFor(MemoryAccess *MA) const {
  auto Domain = getDomainFor(MA);
  auto AccRel = MA->getLatestAccessRelation();
  return AccRel.intersect_domain(Domain);
}

isl::map ZoneAlgorithm::getDefToTarget(ScopStmt *DefStmt,
                                       ScopStmt *TargetStmt) {
  // No translation required if the definition is already at the target.
  if (TargetStmt == DefStmt)
    return isl::map::identity(
        getDomainFor(TargetStmt).get_space().map_from_set());

  isl::map &Result = DefToTargetCache[std::make_pair(TargetStmt, DefStmt)];

  // This is a shortcut in case the schedule is still the original and
  // TargetStmt is in the same loop as or nested inside DefStmt's loop. With
  // the additional assumption that operand trees do not cross DefStmt's loop
  // header, the coordinates TargetStmt's instance shares with DefStmt are the
  // same as DefStmt's coordinates. All TargetStmt instances with this prefix
  // share the same DefStmt instance.
  // Model:
  //
  //   for (int i = 0; i < N; i += 1) {
  //   DefStmt:
  //     D = ...;
  //     for (int j = 0; j < N; j += 1) {
  //     TargetStmt:
  //       use(D);
  //     }
  //   }
  //
  // Here, the value used in TargetStmt is defined in the corresponding
  // DefStmt, i.e.
  //
  //   { DefStmt[i] -> TargetStmt[i,j] }
  //
  // In practice, this should cover the majority of cases.
  if (!Result && S->isOriginalSchedule() &&
      isInsideLoop(DefStmt->getSurroundingLoop(),
                   TargetStmt->getSurroundingLoop())) {
    isl::set DefDomain = getDomainFor(DefStmt);
    isl::set TargetDomain = getDomainFor(TargetStmt);
    assert(DefDomain.dim(isl::dim::set) <= TargetDomain.dim(isl::dim::set));

    Result = isl::map::from_domain_and_range(DefDomain, TargetDomain);
    for (unsigned i = 0, DefDims = DefDomain.dim(isl::dim::set); i < DefDims;
         i += 1)
      Result = Result.equate(isl::dim::in, i, isl::dim::out, i);
  }

  if (!Result) {
    // { DomainDef[] -> DomainTarget[] }
    Result = computeUseToDefFlowDependency(TargetStmt, DefStmt).reverse();
    simplify(Result);
  }

  return Result;
}

isl::map ZoneAlgorithm::getScalarReachingDefinition(ScopStmt *Stmt) {
  auto &Result = ScalarReachDefZone[Stmt];
  if (Result)
    return Result;

  auto Domain = getDomainFor(Stmt);
  Result = computeScalarReachingDefinition(Schedule, Domain, false, true);
  simplify(Result);

  return Result;
}

isl::map ZoneAlgorithm::getScalarReachingDefinition(isl::set DomainDef) {
  auto DomId = DomainDef.get_tuple_id();
  auto *Stmt = static_cast<ScopStmt *>(isl_id_get_user(DomId.get()));

  auto StmtResult = getScalarReachingDefinition(Stmt);

  return StmtResult.intersect_range(DomainDef);
}

isl::map ZoneAlgorithm::makeUnknownForDomain(ScopStmt *Stmt) const {
  return ::makeUnknownForDomain(getDomainFor(Stmt));
}

isl::id ZoneAlgorithm::makeValueId(Value *V) {
  if (!V)
    return nullptr;

  auto &Id = ValueIds[V];
  if (Id.is_null()) {
    auto Name = getIslCompatibleName("Val_", V, ValueIds.size() - 1,
                                     std::string(), UseInstructionNames);
    Id = isl::id::alloc(IslCtx.get(), Name.c_str(), V);
  }
  return Id;
}

isl::space ZoneAlgorithm::makeValueSpace(Value *V) {
  auto Result = ParamSpace.set_from_params();
  return
      Result.set_tuple_id(isl::dim::set, makeValueId(V));
}

isl::set ZoneAlgorithm::makeValueSet(Value *V) {
  auto Space = makeValueSpace(V);
  return isl::set::universe(Space);
}

isl::map ZoneAlgorithm::makeValInst(Value *Val, ScopStmt *UserStmt, Loop *Scope,
                                    bool IsCertain) {
  // If the definition/write is conditional, the value at the location could
  // be either the written value or the old value. Since we cannot know which
  // one, consider the value to be unknown.
  if (!IsCertain)
    return makeUnknownForDomain(UserStmt);

  auto DomainUse = getDomainFor(UserStmt);
  auto VUse = VirtualUse::create(S, UserStmt, Scope, Val, true);
  switch (VUse.getKind()) {
  case VirtualUse::Constant:
  case VirtualUse::Block:
  case VirtualUse::Hoisted:
  case VirtualUse::ReadOnly: {
    // The definition does not depend on the statement which uses it.
    auto ValSet = makeValueSet(Val);
    return isl::map::from_domain_and_range(DomainUse, ValSet);
  }

  case VirtualUse::Synthesizable: {
    auto *ScevExpr = VUse.getScevExpr();
    auto UseDomainSpace = DomainUse.get_space();

    // Construct the SCEV space.
    // TODO: Add only the induction variables referenced in SCEVAddRecExpr
    // expressions, not just all of them.
    auto ScevId = isl::manage(isl_id_alloc(
        UseDomainSpace.get_ctx().get(), nullptr, const_cast<SCEV *>(ScevExpr)));

    auto ScevSpace = UseDomainSpace.drop_dims(isl::dim::set, 0, 0);
    ScevSpace = ScevSpace.set_tuple_id(isl::dim::set, ScevId);

    // { DomainUse[] -> ScevExpr[] }
    auto ValInst =
        isl::map::identity(UseDomainSpace.map_from_domain_and_range(ScevSpace));
    return ValInst;
  }

  case VirtualUse::Intra: {
    // Definition and use are in the same statement. We do not need to compute
    // a reaching definition.

    // { llvm::Value }
    auto ValSet = makeValueSet(Val);

    // { UserDomain[] -> llvm::Value }
    auto ValInstSet = isl::map::from_domain_and_range(DomainUse, ValSet);

    // { UserDomain[] -> [UserDomain[] -> llvm::Value] }
    auto Result = ValInstSet.domain_map().reverse();
    simplify(Result);
    return Result;
  }

  case VirtualUse::Inter: {
    // The value is defined in a different statement.

    auto *Inst = cast<Instruction>(Val);
    auto *ValStmt = S->getStmtFor(Inst);

    // If the llvm::Value is defined in a removed Stmt, we cannot derive its
    // domain. We could use an arbitrary statement, but this could result in
    // different ValInst[] for the same llvm::Value.
    if (!ValStmt)
      return ::makeUnknownForDomain(DomainUse);

    // { DomainUse[] -> DomainDef[] }
    auto UsedInstance = getDefToTarget(ValStmt, UserStmt).reverse();

    // { llvm::Value }
    auto ValSet = makeValueSet(Val);

    // { DomainUse[] -> llvm::Value[] }
    auto ValInstSet = isl::map::from_domain_and_range(DomainUse, ValSet);

    // { DomainUse[] -> [DomainDef[] -> llvm::Value] }
    auto Result = UsedInstance.range_product(ValInstSet);

    simplify(Result);
    return Result;
  }
  }
  llvm_unreachable("Unhandled use type");
}

/// Remove all computed PHIs out of @p Input and replace them by their incoming
/// values.
///
/// @param Input        { [] -> ValInst[] }
/// @param ComputedPHIs Set of PHIs that are replaced. Their ValInst must appear
///                     on the LHS of @p NormalizeMap.
/// @param NormalizeMap { ValInst[] -> ValInst[] }
static isl::union_map normalizeValInst(isl::union_map Input,
                                       const DenseSet<PHINode *> &ComputedPHIs,
                                       isl::union_map NormalizeMap) {
  isl::union_map Result = isl::union_map::empty(Input.get_space());
  for (isl::map Map : Input.get_map_list()) {
    isl::space Space = Map.get_space();
    isl::space RangeSpace = Space.range();

    // Instructions within the SCoP are always wrapped. Non-wrapped tuples
    // are therefore invariant in the SCoP and don't need normalization.
    if (!RangeSpace.is_wrapping()) {
      Result = Result.add_map(Map);
      continue;
    }

    auto *PHI = dyn_cast<PHINode>(static_cast<Value *>(
        RangeSpace.unwrap().get_tuple_id(isl::dim::out).get_user()));

    // If no normalization is necessary, then the ValInst stands for itself.
    if (!ComputedPHIs.count(PHI)) {
      Result = Result.add_map(Map);
      continue;
    }

    // Otherwise, apply the normalization.
    isl::union_map Mapped = isl::union_map(Map).apply_range(NormalizeMap);
    Result = Result.unite(Mapped);
    NumPHINormialization++;
  }
  return Result;
}

isl::union_map ZoneAlgorithm::makeNormalizedValInst(llvm::Value *Val,
                                                    ScopStmt *UserStmt,
                                                    llvm::Loop *Scope,
                                                    bool IsCertain) {
  isl::map ValInst = makeValInst(Val, UserStmt, Scope, IsCertain);
  isl::union_map Normalized =
      normalizeValInst(ValInst, ComputedPHIs, NormalizeMap);
  return Normalized;
}

bool ZoneAlgorithm::isCompatibleAccess(MemoryAccess *MA) {
  if (!MA)
    return false;
  if (!MA->isLatestArrayKind())
    return false;
  Instruction *AccInst = MA->getAccessInstruction();
  return isa<StoreInst>(AccInst) || isa<LoadInst>(AccInst);
}

bool ZoneAlgorithm::isNormalizable(MemoryAccess *MA) {
  assert(MA->isRead());

  // Exclude ExitPHIs; we assume that a normalizable PHI has a READ
  // MemoryAccess.
  if (!MA->isOriginalPHIKind())
    return false;

  // Exclude recursive PHIs; normalizing them would require a transitive
  // closure.
  auto *PHI = cast<PHINode>(MA->getAccessInstruction());
  if (RecursivePHIs.count(PHI))
    return false;

  // Ensure that each incoming value can be represented by a ValInst[].
  // Values from statements with multiple incoming values would have to be
  // represented by the PHI itself, but neither the normalization nor
  // isNormalized() handles this case yet.
  const ScopArrayInfo *SAI = MA->getOriginalScopArrayInfo();
  auto Incomings = S->getPHIIncomings(SAI);
  for (MemoryAccess *Incoming : Incomings) {
    if (Incoming->getIncoming().size() != 1)
      return false;
  }

  return true;
}

isl::boolean ZoneAlgorithm::isNormalized(isl::map Map) {
  isl::space Space = Map.get_space();
  isl::space RangeSpace = Space.range();

  isl::boolean IsWrapping = RangeSpace.is_wrapping();
  if (!IsWrapping.is_true())
    return !IsWrapping;
  isl::space Unwrapped = RangeSpace.unwrap();

  isl::id OutTupleId = Unwrapped.get_tuple_id(isl::dim::out);
  if (OutTupleId.is_null())
    return isl::boolean();
  auto *PHI = dyn_cast<PHINode>(static_cast<Value *>(OutTupleId.get_user()));
  if (!PHI)
    return true;

  isl::id InTupleId = Unwrapped.get_tuple_id(isl::dim::in);
  if (InTupleId.is_null())
    return isl::boolean();
  auto *IncomingStmt = static_cast<ScopStmt *>(InTupleId.get_user());
  MemoryAccess *PHIRead = IncomingStmt->lookupPHIReadOf(PHI);
  if (!isNormalizable(PHIRead))
    return true;

  return false;
}

isl::boolean ZoneAlgorithm::isNormalized(isl::union_map UMap) {
  isl::boolean Result = true;
  for (isl::map Map : UMap.get_map_list()) {
    Result = isNormalized(Map);
    if (Result.is_true())
      continue;
    break;
  }
  return Result;
}

void ZoneAlgorithm::computeCommon() {
  AllReads = makeEmptyUnionMap();
  AllMayWrites = makeEmptyUnionMap();
  AllMustWrites = makeEmptyUnionMap();
  AllWriteValInst = makeEmptyUnionMap();
  AllReadValInst = makeEmptyUnionMap();

  // Default to empty, i.e. no normalization/replacement is taking place. Call
  // computeNormalizedPHIs() to initialize.
  NormalizeMap = makeEmptyUnionMap();
  ComputedPHIs.clear();

  for (auto &Stmt : *S) {
    for (auto *MA : Stmt) {
      if (!MA->isLatestArrayKind())
        continue;

      if (MA->isRead())
        addArrayReadAccess(MA);

      if (MA->isWrite())
        addArrayWriteAccess(MA);
    }
  }

  // { DomainWrite[] -> Element[] }
  AllWrites = AllMustWrites.unite(AllMayWrites);

  // { [Element[] -> Zone[]] -> DomainWrite[] }
  WriteReachDefZone =
      computeReachingDefinition(Schedule, AllWrites, false, true);
  simplify(WriteReachDefZone);
}

void ZoneAlgorithm::computeNormalizedPHIs() {
  // Determine which PHIs can reference themselves. They are excluded from
  // normalization to avoid problems with transitive closures.
  for (ScopStmt &Stmt : *S) {
    for (MemoryAccess *MA : Stmt) {
      if (!MA->isPHIKind())
        continue;
      if (!MA->isRead())
        continue;

      // TODO: This can be made more efficient since isRecursivePHI can
      // theoretically determine recursiveness for multiple values and/or
      // cache results.
      auto *PHI = cast<PHINode>(MA->getAccessInstruction());
      if (isRecursivePHI(PHI)) {
        NumRecursivePHIs++;
        RecursivePHIs.insert(PHI);
      }
    }
  }

  // { PHIValInst[] -> IncomingValInst[] }
  isl::union_map AllPHIMaps = makeEmptyUnionMap();

  // Discover new PHIs and try to normalize them.
  DenseSet<PHINode *> AllPHIs;
  for (ScopStmt &Stmt : *S) {
    for (MemoryAccess *MA : Stmt) {
      if (!MA->isOriginalPHIKind())
        continue;
      if (!MA->isRead())
        continue;
      if (!isNormalizable(MA))
        continue;

      auto *PHI = cast<PHINode>(MA->getAccessInstruction());
      const ScopArrayInfo *SAI = MA->getOriginalScopArrayInfo();

      // { PHIDomain[] -> PHIValInst[] }
      isl::map PHIValInst = makeValInst(PHI, &Stmt, Stmt.getSurroundingLoop());

      // { IncomingDomain[] -> IncomingValInst[] }
      isl::union_map IncomingValInsts = makeEmptyUnionMap();

      // Get all incoming values.
      for (MemoryAccess *MA : S->getPHIIncomings(SAI)) {
        ScopStmt *IncomingStmt = MA->getStatement();

        auto Incoming = MA->getIncoming();
        assert(Incoming.size() == 1 && "The incoming value must be "
                                       "representable by something else than "
                                       "the PHI itself");
        Value *IncomingVal = Incoming[0].second;

        // { IncomingDomain[] -> IncomingValInst[] }
        isl::map IncomingValInst = makeValInst(
            IncomingVal, IncomingStmt, IncomingStmt->getSurroundingLoop());

        IncomingValInsts = IncomingValInsts.add_map(IncomingValInst);
      }

      // Determine which instance of the PHI statement corresponds to which
      // incoming value.
      // { PHIDomain[] -> IncomingDomain[] }
      isl::union_map PerPHI = computePerPHI(SAI);

      // { PHIValInst[] -> IncomingValInst[] }
      isl::union_map PHIMap =
          PerPHI.apply_domain(PHIValInst).apply_range(IncomingValInsts);
      assert(!PHIMap.is_single_valued().is_false());

      // Resolve transitiveness: The incoming value of the newly discovered
      // PHI may reference a previously normalized PHI. At the same time,
      // already normalized PHIs might be normalized to the new PHI. At the
      // end, none of the PHIs may appear on the right-hand side of the
      // normalization map.
      PHIMap = normalizeValInst(PHIMap, AllPHIs, AllPHIMaps);
      AllPHIs.insert(PHI);
      AllPHIMaps = normalizeValInst(AllPHIMaps, AllPHIs, PHIMap);

      AllPHIMaps = AllPHIMaps.unite(PHIMap);
      NumNormalizablePHIs++;
    }
  }
  simplify(AllPHIMaps);

  // Apply the normalization.
  ComputedPHIs = AllPHIs;
  NormalizeMap = AllPHIMaps;

  assert(!NormalizeMap || isNormalized(NormalizeMap));
}

void ZoneAlgorithm::printAccesses(llvm::raw_ostream &OS, int Indent) const {
  OS.indent(Indent) << "After accesses {\n";
  for (auto &Stmt : *S) {
    OS.indent(Indent + 4) << Stmt.getBaseName() << "\n";
    for (auto *MA : Stmt)
      MA->print(OS);
  }
  OS.indent(Indent) << "}\n";
}

isl::union_map ZoneAlgorithm::computeKnownFromMustWrites() const {
  // { [Element[] -> Zone[]] -> [Element[] -> DomainWrite[]] }
  isl::union_map EltReachDef = distributeDomain(WriteReachDefZone.curry());

  // { [Element[] -> DomainWrite[]] -> ValInst[] }
  isl::union_map AllKnownWriteValInst = filterKnownValInst(AllWriteValInst);

  // { [Element[] -> Zone[]] -> ValInst[] }
  return EltReachDef.apply_range(AllKnownWriteValInst);
}

isl::union_map ZoneAlgorithm::computeKnownFromLoad() const {
  // { Element[] }
  isl::union_set AllAccessedElts = AllReads.range().unite(AllWrites.range());

  // { Element[] -> Scatter[] }
  isl::union_map EltZoneUniverse = isl::union_map::from_domain_and_range(
      AllAccessedElts, isl::set::universe(ScatterSpace));

  // This assumes there are no "holes" in
  // isl_union_map_domain(WriteReachDefZone); alternatively, compute the zones
  // before the first write or those that are not written at all.
  // { Element[] -> Scatter[] }
  isl::union_set NonReachDef =
      EltZoneUniverse.wrap().subtract(WriteReachDefZone.domain());

  // { [Element[] -> Zone[]] -> ReachDefId[] }
  isl::union_map DefZone =
      WriteReachDefZone.unite(isl::union_map::from_domain(NonReachDef));

  // { [Element[] -> Scatter[]] -> Element[] }
  isl::union_map EltZoneElt = EltZoneUniverse.domain_map();

  // { [Element[] -> Zone[]] -> [Element[] -> ReachDefId[]] }
  isl::union_map DefZoneEltDefId = EltZoneElt.range_product(DefZone);

  // { Element[] -> [Zone[] -> ReachDefId[]] }
  isl::union_map EltDefZone = DefZone.curry();

  // { [Element[] -> Zone[]] -> [Element[] -> ReachDefId[]] }
  isl::union_map EltZoneEltDefid = distributeDomain(EltDefZone);

  // { [Element[] -> Scatter[]] -> DomainRead[] }
  isl::union_map Reads = AllReads.range_product(Schedule).reverse();

  // { [Element[] -> Scatter[]] -> [Element[] -> DomainRead[]] }
  isl::union_map ReadsElt = EltZoneElt.range_product(Reads);

  // { [Element[] -> Scatter[]] -> ValInst[] }
  isl::union_map ScatterKnown = ReadsElt.apply_range(AllReadValInst);

  // { [Element[] -> ReachDefId[]] -> ValInst[] }
  isl::union_map DefidKnown =
      DefZoneEltDefId.apply_domain(ScatterKnown).reverse();

  // { [Element[] -> Zone[]] -> ValInst[] }
  return DefZoneEltDefId.apply_range(DefidKnown);
}

isl::union_map ZoneAlgorithm::computeKnown(bool FromWrite,
                                           bool FromRead) const {
  isl::union_map Result = makeEmptyUnionMap();

  if (FromWrite)
    Result = Result.unite(computeKnownFromMustWrites());

  if (FromRead)
    Result = Result.unite(computeKnownFromLoad());

  simplify(Result);
  return Result;
}
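
// Illustrative sketch (kept in comment form so it is not compiled; the names
// below are invented for this example and are not part of Polly's API): the
// half-open zone encoding described in the file header can be mimicked with a
// plain integer set, where an integer i stands for the open interval ]i-1,i[.
// Storing ]1,2[ + ]3,4[ as { 2, 4 } keeps the gap ]2,3[ representable,
// whereas storing the closed intervals [1,2] and [3,4] could be coalesced to
// [1,4], losing that gap:
//
//   std::set<int> Zone = {2, 4};      // encodes the zone ]1,2[ + ]3,4[
//   bool GapInZone = Zone.count(3);   // false: ]2,3[ is not part of the zone
//   Zone.insert(3);                   // now encodes ]1,4[
//
// The same idea is what allows isl to coalesce zone sets safely: coalescing
// { 2, 4 } cannot accidentally produce { 2, 3, 4 }.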