Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Split ir nodes into multiple translation units#5596

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
wujingyue wants to merge1 commit intomain
base:main
Choose a base branch
Loading
fromwjy/files

Conversation

@wujingyue
Copy link
Collaborator

  • split monolithic csrc/ir/nodes.cpp into internal_nodes.cpp and internal_base_nodes.cpp matching header declarations
  • cleaned includes with IWYU-style reductions and added required CUDA/runtime headers where needed
  • updated build and verified via _bn

greptile-apps[bot] reacted with thumbs up emoji
@github-actions
Copy link

Description

  • Split monolithic csrc/ir/nodes.cpp into internal_nodes.cpp and internal_base_nodes.cpp for better code organization

  • Moved IterDomainBuilder, IterDomain, and TensorDomain implementations to new internal_base_nodes.cpp file

  • Cleaned up includes with IWYU-style reductions and added required CUDA/runtime headers where needed

  • Updated CMakeLists.txt to compile the new split translation units

Changes walkthrough

Relevant files
Enhancement
internal_nodes.cpp
Cleaned up internal nodes implementation file                       

csrc/ir/internal_nodes.cpp

  • Split from original nodes.cpp with cleaned includes
  • Added ATen and CUDA headers as needed
  • Removed unused includes like device_lower/lower2device.h
  • Updated to include internal_nodes.h instead of interface_nodes.h
  • +10/-1695
    internal_base_nodes.cpp
    New base nodes implementation file                                             

    csrc/ir/internal_base_nodes.cpp

  • New file containing IterDomainBuilder, IterDomain, and TensorDomain
    implementations
  • Moved base node classes from original nodes.cpp
  • Added proper includes for STL containers and IR utilities
  • Contains 1720 lines of implementation code
  • +1720/-0
    Configuration changes
    CMakeLists.txt
    Updated build configuration for split files                           

    CMakeLists.txt

  • Replaced nodes.cpp with internal_base_nodes.cpp and internal_nodes.cpp
  • Updated build system to compile split translation units
  • +2/-1     

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    🔒 Security concerns

    No - This is a code reorganization with no functional changes.

    ⚡ Recommended focus areas for review
    Include Dependencies

    Verify that all necessary headers are included. The file uses many IR utilities and operations (IrBuilder, IterDomainBuilder, etc.) but only includes a subset of headers. Check that no implicit dependencies exist that could cause linking issues.

    // clang-format off/* * SPDX-FileCopyrightText: Copyright (c) 2023-present NVIDIA CORPORATION & AFFILIATES. * All rights reserved. * SPDX-License-Identifier: BSD-3-Clause*/// clang-format on#include<algorithm>#include<iterator>#include<list>#include<numeric>#include<optional>#include<ranges>#include<sstream>#include<string>#include<unordered_map>#include<unordered_set>#include<utility>#include<vector>#include<ir/cloner.h>#include<ir/internal_base_nodes.h>#include<ir/iostream.h>#include<ir/utils.h>#include<ops/arith.h>#include<transform_rfactor.h>#include<transform_view.h>#include<type.h>namespacenvfuser {IterDomainBuilder::IterDomainBuilder(Val* _start, Val* _extent)    : start_(_start), extent_(_extent) {NVF_ERROR(      start_ !=nullptr && extent_ !=nullptr,"Start and extent are required to build an iter domain.");}IterDomainBuilder::IterDomainBuilder(const IterDomain* id)    : start_(id->start()),      extent_(id->extent()),      expanded_extent_(          id->hasExpandedExtent() ? id->expandedExtent() : nullptr),      stop_offset_(id->stopOffset()),      parallel_type_(id->getParallelType()),      iter_type_(id->getIterType()),      is_rfactor_domain_(id->isRFactorProduct()),      is_padded_dimension_(id->hasPaddingToMultipleOfWarp()),      is_clustered_dimension_(id->isClusteredBlockDim()),      padded_to_size_(id->getMaybeSizeAfterPadding()) {}
    Header Cleanup Completeness

    The old includes were significantly reduced. While this follows IWYU principles, ensure that all removed headers weren't providing transitive dependencies that other parts of the codebase rely on. The removed headers like lower2device.h, dynamic_transform.h, and others may be needed by other modules.

    #include<numeric>#include<sstream>#include<string>#include<unordered_map>#include<ATen/Functions.h>#include<ATen/TensorIndexing.h>#include<ATen/TensorOptions.h>#include<ATen/Utils.h>#include<ATen/core/Tensor.h>#include<ATen/cuda/CUDAContextLight.h>#include<c10/core/SymInt.h>#include<c10/cuda/CUDAGraphsC10Utils.h>#include<torch/nn/functional/embedding.h>#include<torch/nn/options/embedding.h>#include<device_lower/utils.h>#include<expr_evaluator.h>#include<ir/cloner.h>#include<ir/internal_nodes.h>#include<ir/iostream.h>#include<ir/utils.h>#include<kernel.h>#include<kernel_ir.h>#include<logical_domain_map.h>#include<ops/arith.h>#include<runtime/allocations.h>#include<transform_iter.h>#include<transform_rfactor.h>#include<transform_view.h>

    @greptile-apps
    Copy link
    Contributor

    Greptile Overview

    Greptile Summary

    Refactored monolithiccsrc/ir/nodes.cpp (6531 lines) into two focused translation units matching the existing header hierarchy:internal_base_nodes.cpp (1720 lines) forIterDomain andTensorDomain implementations, andinternal_nodes.cpp (4846 lines) for remaining IR node operations.

    Key improvements:

    • Reduced include dependencies ininternal_base_nodes.cpp by removing unnecessary CUDA/ATen headers
    • Added required CUDA runtime headers (ATen/Functions.h,ATen/TensorIndexing.h, etc.) tointernal_nodes.cpp where needed
    • UpdatedCMakeLists.txt to replace singleir/nodes.cpp with two new translation units
    • Maintained all functionality with clean separation following existing header structure (internal_base_nodes.h vs internal_nodes.h)

    The split aligns with the documented IR hierarchy and should improve build times through better dependency isolation and parallel compilation.

    Confidence Score: 5/5

    • This PR is safe to merge - it's a straightforward refactoring that splits a large file into logical units
    • Score of 5 reflects that this is a pure refactoring with no functional changes, proper build system updates, appropriate include pruning and addition, and alignment with existing header structure
    • No files require special attention

    Important Files Changed

    File Analysis

    FilenameScoreOverview
    CMakeLists.txt5/5Replaced singleir/nodes.cpp with two new translation unitsir/internal_base_nodes.cpp andir/internal_nodes.cpp to split monolithic file
    csrc/ir/internal_base_nodes.cpp5/5New translation unit containingIterDomain andTensorDomain implementations (~1720 lines) with reduced include dependencies
    csrc/ir/internal_nodes.cpp5/5New translation unit containing remaining IR node implementations (~4846 lines) including ops likeFullOp,SelectOp, etc. with necessary CUDA/ATen headers

    Sequence Diagram

    sequenceDiagram    participant Build as Build System    participant Old as nodes.cpp<br/>(6531 lines)    participant New1 as internal_base_nodes.cpp<br/>(1720 lines)    participant New2 as internal_nodes.cpp<br/>(4846 lines)    participant Headers as Header Files    Note over Old: Original monolithic file    Note over Old,Headers: Split based on header structure    Old->>New1: Extract IterDomain implementations    Old->>New1: Extract TensorDomain implementations    Note over New1: Base node types<br/>Reduced includes    Old->>New2: Extract FullOp, SelectOp, etc.    Old->>New2: Extract remaining IR nodes    Note over New2: Higher-level ops<br/>CUDA/ATen includes    Headers->>New1: #include internal_base_nodes.h    Headers->>New2: #include internal_nodes.h        New1->>Build: Register translation unit    New2->>Build: Register translation unit    Build->>Build: Remove old nodes.cpp    Build->>Build: Add internal_base_nodes.cpp    Build->>Build: Add internal_nodes.cpp
    Loading

    Copy link
    Contributor

    @greptile-appsgreptile-appsbot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others.Learn more.

    3 files reviewed, no comments

    Edit Code Review Agent Settings |Greptile

    @wujingyue
    Copy link
    CollaboratorAuthor

    !test

    Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

    Reviewers

    @naoyamnaoyamAwaiting requested review from naoyam

    @rdspring1rdspring1Awaiting requested review from rdspring1

    1 more reviewer

    @greptile-appsgreptile-apps[bot]greptile-apps[bot] left review comments

    Reviewers whose approvals may not affect merge requirements

    At least 1 approving review is required to merge this pull request.

    Assignees

    No one assigned

    Labels

    None yet

    Projects

    None yet

    Milestone

    No milestone

    Development

    Successfully merging this pull request may close these issues.

    2 participants

    @wujingyue

    [8]ページ先頭

    ©2009-2025 Movatter.jp