[RLlib] Optimize rnn_sequencing performance #46502
Open
cpnota wants to merge 1 commit into ray-project:master from cpnota:master
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
Why are these changes needed?
We found the performance of LSTMs in RLlib to be extremely slow compared to other methods, with a single training iteration of PPO taking 179 seconds (compared to ~9 seconds with a similarly-sized MLP network). This made RNNs/LSTMs, as well as some transformer implementations, completely unusable for our purposes.
However, when profiling, we found this was primarily due to a very slow copy operation.
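A hotspot of this kind can be located with the standard-library profiler. The following is a minimal sketch, not the profiling setup used for this PR: `training_step_stand_in` is a hypothetical placeholder for a real training iteration, and the deep copy inside it only imitates an expensive copy of per-step info dicts.

```python
import cProfile
import copy
import io
import pstats


def training_step_stand_in():
    # Hypothetical stand-in for one training iteration: build a list of
    # per-step info dicts and deep-copy it, imitating a costly copy.
    infos = [{"step": i, "extra": list(range(20))} for i in range(2000)]
    return copy.deepcopy(infos)


profiler = cProfile.Profile()
profiler.enable()
training_step_stand_in()
profiler.disable()

# Sort by cumulative time so the functions that dominate runtime come first.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

In the printed report, the `deepcopy` entries from `copy.py` appear near the top, which is the kind of signal that points at a copy operation as the bottleneck.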
Further investigation revealed that most of this runtime was spent copying the `infos` dict. We determined that the root cause was inconsistent handling of the dictionary in `rnn_sequencing`. While the non-recurrent implementation stores the list of dictionaries as a NumPy array of objects, `rnn_sequencing` instead stores it as a Python list.
We applied a one-line fix to make this behavior consistent and store the list as a NumPy array.
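To illustrate why the container type matters, here is a minimal sketch using only plain NumPy and the standard library. It is not RLlib's copy implementation, and `infos_list` / `infos_array` are hypothetical stand-ins for a batch's infos column. Deep-copying a list of dicts rebuilds every dict, whereas copying an object array only copies references to the existing dicts.

```python
import copy

import numpy as np

# Hypothetical per-timestep info dicts, standing in for a batch's "infos" column.
infos_list = [{"step": i, "extra": {"terms": [0.0, 1.0]}} for i in range(1000)]

# The same data stored as a 1-D NumPy array of Python objects.
infos_array = np.empty(len(infos_list), dtype=object)
infos_array[:] = infos_list

# Deep-copying the list creates a new dict for every element.
deep = copy.deepcopy(infos_list)

# Copying the object array copies only the references, not the dicts.
shallow = infos_array.copy()

same_after_deepcopy = deep[0] is infos_list[0]       # new dict object
same_after_array_copy = shallow[0] is infos_list[0]  # same dict object
print(same_after_deepcopy, same_after_array_copy)
```

The trade-off is that the copied array shares its dicts with the original, so mutating an info dict through one is visible through the other.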
This causes the `copy` function to perform a shallow copy, drastically improving performance by ~6x to around 29 seconds.
However, we found that the training loop was still spending a lot of time in `rnn_sequencing`. We traced this down to a slow element-wise copy into an array. We instead replaced this with a vectorized copy.
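The difference between the two copy styles can be sketched as follows. This is an illustration under assumptions, not the code changed in this PR: `pad_elementwise` and `pad_vectorized` are hypothetical helpers that pad variable-length sequences into fixed-size slots of `max_seq_len` rows, the first copying one row per Python loop iteration and the second copying each sequence with a single slice assignment.

```python
import numpy as np


def pad_elementwise(f, seq_lens, max_seq_len):
    # Copy one row at a time: one Python-level assignment per timestep.
    f_pad = np.zeros((len(seq_lens) * max_seq_len,) + f.shape[1:], dtype=f.dtype)
    seq_base = 0
    i = 0
    for len_ in seq_lens:
        for seq_offset in range(len_):
            f_pad[seq_base + seq_offset] = f[i]
            i += 1
        seq_base += max_seq_len
    return f_pad


def pad_vectorized(f, seq_lens, max_seq_len):
    # Copy each sequence with one slice assignment, done inside NumPy.
    f_pad = np.zeros((len(seq_lens) * max_seq_len,) + f.shape[1:], dtype=f.dtype)
    seq_base = 0
    i = 0
    for len_ in seq_lens:
        f_pad[seq_base:seq_base + len_] = f[i:i + len_]
        i += len_
        seq_base += max_seq_len
    return f_pad


# Six timesteps of 2-D features, split into sequences of length 2, 3 and 1.
f = np.arange(12, dtype=np.float32).reshape(6, 2)
seq_lens = [2, 3, 1]
max_seq_len = 3

a = pad_elementwise(f, seq_lens, max_seq_len)
b = pad_vectorized(f, seq_lens, max_seq_len)
print(np.array_equal(a, b))
```

Both helpers produce the same padded array. The vectorized version runs one Python-level operation per sequence rather than one per timestep, which is where the savings come from when sequences are long.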
This further improved performance. (For this measurement, we also removed the one-time summary logging; that removal is not included in this PR.)
Combined, these changes reduced the run time of the training step from 179 seconds to 15 seconds, approximately a 12x speedup and competitive with training an MLP.
Related issue number
Checks
- I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.