- Notifications
You must be signed in to change notification settings - Fork97
feat: Add leader election#442
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
codecovbot commentedFeb 13, 2022 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Codecov Report
@@ Coverage Diff @@## master #442 +/- ##==========================================- Coverage 49.33% 49.27% -0.06%========================================== Files 56 56 Lines 4966 4980 +14 ==========================================+ Hits 2450 2454 +4- Misses 2265 2275 +10 Partials 251 251
Flags with carried forward coverage won't be shown.Click here to find out more.
Continue to review full report at Codecov.
|
mikesmithgh commentedFeb 13, 2022
HI@grzesuav and@AmitKumarDas, could you please review when you have free time? I left this in draft because I have TODOs around naming and will update after feedback. |
grzesuav commentedFeb 14, 2022
hi @mjsmith1028 , it would be good to have consistency withhttps://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/leaderelection/leader_election.go. About cmd line options, I would make them the same as config options (i.e. Thanks for PR! |
AmitKumarDas commentedFeb 16, 2022 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
@mjsmith1028 Did you run some tests to verify if nothing breaks. Is it feasible for you to add some tests / examples that can verify this behaviour? |
mikesmithgh commentedFeb 17, 2022
@grzesuav Thanks! I updated to expose all the leader election options and defaulted to the preferred resource lock leases. I'll add some documentation around this. I am wondering if there is any concern about the growing number of arguments passed to metacontroller or if you are fine with that approach? |
mikesmithgh commentedFeb 17, 2022
@AmitKumarDas I only ran some small manual testing around this to verify a leader was elected and the other replica waits until the leader is in a bad state before becoming the new leader. I can definitely add an example. I'm not 100% sure on the approach I should take for testing but I'll think on it. Do you have any ideas? |
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
ea80314 tob2019f0Comparemikesmithgh commentedMar 4, 2022
Hi@AmitKumarDas and@grzesuav, I cleaned up this PR and it is ready to review. Please let me know if you have anymore feedback. |
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
2d9198f to481f378Comparemikesmithgh commentedMar 7, 2022
FYI I rebased to clean up the commit history. |
Signed-off-by: Mike Smith <10135646+mjsmith1028@users.noreply.github.com>
grzesuav left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Looks good and clear, thanks for this PR !
grzesuav commentedMar 8, 2022
🎉 This PR is included in version 2.3.0 🎉 The release is available onGitHub release Yoursemantic-release bot 📦🚀 |
Uh oh!
There was an error while loading.Please reload this page.
I would like to add leader election. Due to node autoscaler and other variables in the cluster, the metacontroller pod can get rolled/shifted to other nodes. Typically, this does not cause any problems and metacontroller comes up fast. However, we occasionally have issues with inconsistency between environment and bad nodes which may cause metacontroller to be stuck in a pending state for a while. This would allow for more coverage and reduce the chance of downtime in this scenario.