- Notifications
You must be signed in to change notification settings - Fork14.5k
[NFC][flang] Added engineering option for triaging local-alloc-tbaa.#149587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
I triaged a benchmark that showed inaccurate results, when local-alloc-tbaawas enabled. It turned out to be not a real TBAA issue, but ratherTBAA affecting optimizations that affect FMA generation, which introducedan expected accuracy variation. I would like to keep this thresholdcontrol for future uses.
@llvm/pr-subscribers-flang-fir-hlfir Author: Slava Zakharin (vzakhari) ChangesI triaged a benchmark that showed inaccurate results, when local-alloc-tbaa Full diff:https://github.com/llvm/llvm-project/pull/149587.diff 2 Files Affected:
diff --git a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp b/flang/lib/Optimizer/Transforms/AddAliasTags.cppindex b27c1b26dedb3..85403ad257657 100644--- a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp+++ b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp@@ -48,12 +48,21 @@ static llvm::cl::opt<bool> llvm::cl::Hidden, llvm::cl::desc("Add TBAA tags to local allocations."));+// Engineering option to triage TBAA tags attachment for accesses+// of allocatable entities.+static llvm::cl::opt<unsigned> localAllocsThreshold(+ "local-alloc-tbaa-threshold", llvm::cl::init(0), llvm::cl::ReallyHidden,+ llvm::cl::desc("If present, stops generating TBAA tags for accesses of "+ "local allocations after N accesses in a module"));+ namespace { /// Shared state per-module class PassState { public:- PassState(mlir::DominanceInfo &domInfo) : domInfo(domInfo) {}+ PassState(mlir::DominanceInfo &domInfo,+ std::optional<unsigned> localAllocsThreshold)+ : domInfo(domInfo), localAllocsThreshold(localAllocsThreshold) {} /// memoised call to fir::AliasAnalysis::getSource inline const fir::AliasAnalysis::Source &getSource(mlir::Value value) { if (!analysisCache.contains(value))@@ -84,6 +93,11 @@ class PassState { // (e.g. !fir.ref<!fir.type<Derived{f:!fir.box<!fir.heap<f32>>}>>). bool typeReferencesDescriptor(mlir::Type type);+ // Returns true if we can attach a TBAA tag to an access of an allocatable+ // entities. It checks if localAllocsThreshold allows the next tag+ // attachment.+ bool attachLocalAllocTag();+ private: mlir::DominanceInfo &domInfo; fir::AliasAnalysis analysis;@@ -103,6 +117,8 @@ class PassState { // Local pass cache for derived types that contain descriptor // member(s), to avoid the cost of isRecordWithDescriptorMember(). llvm::DenseSet<mlir::Type> typesContainingDescriptors;++ std::optional<unsigned> localAllocsThreshold; }; // Process fir.dummy_scope operations in the given func:@@ -169,6 +185,19 @@ bool PassState::typeReferencesDescriptor(mlir::Type type) { return false; }+bool PassState::attachLocalAllocTag() {+ if (!localAllocsThreshold)+ return true;+ if (*localAllocsThreshold == 0) {+ LLVM_DEBUG(llvm::dbgs().indent(2)+ << "WARN: not assigning TBAA tag for an allocated entity access "+ "due to the threshold\n");+ return false;+ }+ --*localAllocsThreshold;+ return true;+}+ class AddAliasTagsPass : public fir::impl::AddAliasTagsBase<AddAliasTagsPass> { public: void runOnOperation() override;@@ -335,16 +364,16 @@ void AddAliasTagsPass::runOnAliasInterface(fir::FirAliasTagOpInterface op, LLVM_DEBUG(llvm::dbgs().indent(2) << "WARN: unknown defining op for SourceKind::Allocate " << *op << "\n");- } else if (source.isPointer()) {+ } else if (source.isPointer() && state.attachLocalAllocTag()) { LLVM_DEBUG(llvm::dbgs().indent(2) << "Found reference to allocation at " << *op << "\n"); tag = state.getFuncTreeWithScope(func, scopeOp).targetDataTree.getTag();- } else if (name) {+ } else if (name && state.attachLocalAllocTag()) { LLVM_DEBUG(llvm::dbgs().indent(2) << "Found reference to allocation " << name << " at " << *op << "\n"); tag = state.getFuncTreeWithScope(func, scopeOp) .allocatedDataTree.getTag(*name);- } else {+ } else if (state.attachLocalAllocTag()) { LLVM_DEBUG(llvm::dbgs().indent(2) << "WARN: couldn't find a name for allocation " << *op << "\n");@@ -372,7 +401,9 @@ void AddAliasTagsPass::runOnOperation() { // thinks the pass operates on), then the real work of the pass is done in // runOnAliasInterface auto &domInfo = getAnalysis<mlir::DominanceInfo>();- PassState state(domInfo);+ PassState state(domInfo, localAllocsThreshold.getPosition()+ ? std::optional<unsigned>(localAllocsThreshold)+ : std::nullopt); mlir::ModuleOp mod = getOperation(); mod.walk(diff --git a/flang/test/Transforms/tbaa-local-alloc-threshold.fir b/flang/test/Transforms/tbaa-local-alloc-threshold.firnew file mode 100644index 0000000000000..27c19a6e23095--- /dev/null+++ b/flang/test/Transforms/tbaa-local-alloc-threshold.fir@@ -0,0 +1,23 @@+// Check that -local-alloc-tbaa-threshold option limits+// the attachment of TBAA tags to accesses of locally allocated entities.+// RUN: fir-opt --fir-add-alias-tags -local-alloc-tbaa-threshold=2 %s | FileCheck %s --check-prefixes=ALL,COUNT2+// RUN: fir-opt --fir-add-alias-tags -local-alloc-tbaa-threshold=1 %s | FileCheck %s --check-prefixes=ALL,COUNT1+// RUN: fir-opt --fir-add-alias-tags -local-alloc-tbaa-threshold=0 %s | FileCheck %s --check-prefixes=ALL,COUNT0++// ALL-LABEL: func.func @_QPtest() {+// COUNT2: fir.load{{.*}}{tbaa =+// COUNT2: fir.store{{.*}}{tbaa =+// COUNT1: fir.load{{.*}}{tbaa =+// COUNT1-NOT: fir.store{{.*}}{tbaa =+// COUNT0-NOT: fir.load{{.*}}{tbaa =+// COUNT0-NOT: fir.store{{.*}}{tbaa =+func.func @_QPtest() {+ %0 = fir.dummy_scope : !fir.dscope+ %1 = fir.alloca f32 {bindc_name = "x", uniq_name = "_QFtestEx"}+ %2 = fir.declare %1 {uniq_name = "_QFtestEx"} : (!fir.ref<f32>) -> !fir.ref<f32>+ %3 = fir.alloca f32 {bindc_name = "y", uniq_name = "_QFtestEy"}+ %4 = fir.declare %3 {uniq_name = "_QFtestEy"} : (!fir.ref<f32>) -> !fir.ref<f32>+ %5 = fir.load %4 : !fir.ref<f32>+ fir.store %5 to %2 : !fir.ref<f32>+ return+} |
I triaged a benchmark that showed inaccurate results, when local-alloc-tbaa
was enabled. It turned out to be not a real TBAA issue, but rather
TBAA affecting optimizations that affect FMA generation, which introduced
an expected accuracy variation. I would like to keep this threshold
control for future uses.