
[AOTI] Add num_runners to AOTIModelPackageLoader #149364


Closed

desertfire wants to merge 3 commits into gh/desertfire/556/base from gh/desertfire/556/head


desertfire (Contributor) commented Mar 18, 2025 (edited)

Stack from ghstack (oldest at bottom):

-> [AOTI] Add num_runners to AOTIModelPackageLoader #149364

Summary: AOTIModelContainerRunner takes a num_runners argument for multi-threaded inference, but AOTIModelPackageLoader does not expose the same parameter, even though its run() API already accepts an optional cudaStream_t parameter for multi-threaded inference. This PR adds num_runners to AOTIModelPackageLoader.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

Differential Revision: D71357418
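
To illustrate the idea in the summary, here is a minimal, self-contained sketch of a loader forwarding a `num_runners` argument to the runner it wraps, so that several threads can run inference concurrently. Every name here (`ToyContainerRunner`, `ToyPackageLoader`, `run_from_threads`, `peak_in_use`) is hypothetical and uses only the C++ standard library; this is not the PyTorch API, and the real classes operate on compiled model packages, tensors, and an optional `cudaStream_t` rather than integers.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a runner that owns `num_runners` model instances.
// At most `num_runners` calls to run() execute at the same time.
class ToyContainerRunner {
 public:
  explicit ToyContainerRunner(std::size_t num_runners)
      : capacity_(num_runners), free_(num_runners) {
    assert(num_runners > 0);
  }

  std::size_t num_runners() const { return capacity_; }

  int run(int input) {
    {
      // Wait until one of the model instances is free, then claim it.
      std::unique_lock<std::mutex> lock(mu_);
      cv_.wait(lock, [this] { return free_ > 0; });
      --free_;
      peak_in_use_ = std::max(peak_in_use_, capacity_ - free_);
    }
    int output = input * 2;  // placeholder for the real model call
    {
      // Hand the instance back and wake one waiting caller.
      std::lock_guard<std::mutex> lock(mu_);
      ++free_;
    }
    cv_.notify_one();
    return output;
  }

  // Highest number of instances that were busy at once (for inspection only).
  std::size_t peak_in_use() const {
    std::lock_guard<std::mutex> lock(mu_);
    return peak_in_use_;
  }

 private:
  const std::size_t capacity_;
  std::size_t free_;
  std::size_t peak_in_use_ = 0;
  mutable std::mutex mu_;
  std::condition_variable cv_;
};

// Hypothetical loader: it simply forwards num_runners to the wrapped runner,
// defaulting to 1 so that existing single-instance callers keep working.
class ToyPackageLoader {
 public:
  explicit ToyPackageLoader(std::size_t num_runners = 1)
      : runner_(num_runners) {}

  int run(int input) { return runner_.run(input); }
  std::size_t num_runners() const { return runner_.num_runners(); }
  std::size_t peak_in_use() const { return runner_.peak_in_use(); }

 private:
  ToyContainerRunner runner_;
};

// Calls loader.run(i) from `num_threads` threads and collects the outputs.
std::vector<int> run_from_threads(ToyPackageLoader& loader, int num_threads) {
  std::vector<int> outputs(num_threads, 0);
  std::vector<std::thread> threads;
  for (int i = 0; i < num_threads; ++i) {
    threads.emplace_back(
        [&loader, &outputs, i] { outputs[i] = loader.run(i); });
  }
  for (auto& t : threads) {
    t.join();
  }
  return outputs;
}
```

In this sketch, constructing `ToyPackageLoader loader(4)` and calling `run_from_threads(loader, 8)` lets up to four calls proceed at once while the rest wait for a free instance. Defaulting the parameter to 1 is a design choice of the sketch that keeps single-threaded callers unchanged.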

[AOTI] Add num_runners to AOTIModelPackageLoader

0cc3fdc

Summary: AOTIModelContainerRunner takes a num_runners argument for multi-threaded inference, but AOTIModelPackageLoader forgot to take the same parameter, although its run() API already expects to take an optional cudaStream_t parameter for multi-threaded inference.

[ghstack-poisoned]

pytorch-bot (bot) commented Mar 18, 2025 (edited)

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/149364

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit c0ccb40 with merge base fdacf3c:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

pull / linux-jammy-py3-clang12-executorch / test (executorch, 1, 1, linux.2xlarge) (gh) (#144480)
Process completed with exit code 1.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

desertfire added a commit that referenced this pull request

Mar 18, 2025

[AOTI] Add num_runners to AOTIModelPackageLoader

3472315

Summary: AOTIModelContainerRunner takes a num_runners argument for multi-threaded inference, but AOTIModelPackageLoader forgot to take the same parameter, although its run() API already expects to take an optional cudaStream_t parameter for multi-threaded inference.

ghstack-source-id: 1e414d1

Pull Request resolved: #149364

pytorch-bot (bot) added the ciflow/inductor and module: inductor labels

Mar 18, 2025

desertfire (Contributor, Author) commented Mar 18, 2025

@desertfire has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

pytorch-bot (bot) added the ciflow/trunk label (Trigger trunk jobs on your pull request)

Mar 18, 2025

desertfire requested a review from angelayi

March 18, 2025 00:09

desertfire added the topic: improvements and release notes: inductor labels

Mar 18, 2025

desertfire mentioned this pull request

Mar 18, 2025

[AOTInductor] Only support one model instance when use AOTIModelPackageLoader load aot model? #148937

Closed

angelayi approved these changes

Mar 18, 2025


Update on "[AOTI] Add num_runners to AOTIModelPackageLoader"

c72424d

Summary: AOTIModelContainerRunner takes a num_runners argument for multi-threaded inference, but AOTIModelPackageLoader forgot to take the same parameter, although its run() API already expects to take an optional cudaStream_t parameter for multi-threaded inference.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov

Differential Revision: [D71357418](https://our.internmc.facebook.com/intern/diff/D71357418)

[ghstack-poisoned]

desertfire added a commit that referenced this pull request

Mar 18, 2025

[AOTI] Add num_runners to AOTIModelPackageLoader

f788e31

Summary: AOTIModelContainerRunner takes a num_runners argument for multi-threaded inference, but AOTIModelPackageLoader forgot to take the same parameter, although its run() API already expects to take an optional cudaStream_t parameter for multi-threaded inference.

ghstack-source-id: 3836f60

Pull Request resolved: #149364

desertfire (Contributor, Author) commented Mar 18, 2025

@desertfire has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Update on "[AOTI] Add num_runners to AOTIModelPackageLoader"

c0ccb40

Summary: AOTIModelContainerRunner takes a num_runners argument for multi-threaded inference, but AOTIModelPackageLoader forgot to take the same parameter, although its run() API already expects to take an optional cudaStream_t parameter for multi-threaded inference.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov

Differential Revision: [D71357418](https://our.internmc.facebook.com/intern/diff/D71357418)

[ghstack-poisoned]

desertfire added a commit that referenced this pull request

Mar 18, 2025

[AOTI] Add num_runners to AOTIModelPackageLoader

35595be

Summary: AOTIModelContainerRunner takes a num_runners argument for multi-threaded inference, but AOTIModelPackageLoader forgot to take the same parameter, although its run() API already expects to take an optional cudaStream_t parameter for multi-threaded inference.

ghstack-source-id: ccac029

Pull Request resolved: #149364
