[fix](cloud) shorten cache lock held time and add metrics #47472
base: master
When updating bvar metrics, we held the block lock inside the critical section of the cache lock, which kept the cache lock held too long and affected other cache logic. We now use an unsafe method to update the bvar, to improve performance. Key lock metrics and other meaningful metrics are also added to better monitor cache time costs. Signed-off-by: zhengyu <[email protected]>
run buildall
TPC-H: Total hot run time: 32396 ms
TPC-DS: Total hot run time: 192068 ms
ClickBench: Total hot run time: 30.08 s
TeamCity be ut coverage result:
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
@@ -59,6 +59,10 @@ FileBlock::State FileBlock::state() const {
    return _download_state;
}

FileBlock::State FileBlock::state_unsafe() const {
this only works on x86
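For reference, the concern here is presumably that reading `_download_state` without the block lock leans on x86's strong memory ordering; in standard C++, an unsynchronized read that races with a write is a data race on any architecture. One portable alternative, sketched below, is to store the state in a `std::atomic` and read it with a relaxed load. This is not what the PR does: `BlockSketch`, `state_relaxed()` and `set_state()` are hypothetical names used only for illustration, and the real `FileBlock` has more states and members.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for FileBlock; only the state handling is modeled.
class BlockSketch {
public:
    enum class State { EMPTY, DOWNLOADING, DOWNLOADED };

    // Locked accessor: takes the per-block mutex before reading,
    // similar in spirit to state() in the diff above.
    State state() const {
        std::lock_guard<std::mutex> guard(_mutex);
        return _download_state.load(std::memory_order_relaxed);
    }

    // Lock-free accessor: a relaxed atomic load never blocks on the block
    // mutex and is well-defined on every architecture. The value may be
    // slightly stale, which is usually acceptable for metrics.
    State state_relaxed() const {
        return _download_state.load(std::memory_order_relaxed);
    }

    // Writers still serialize on the block mutex.
    void set_state(State s) {
        std::lock_guard<std::mutex> guard(_mutex);
        _download_state.store(s, std::memory_order_relaxed);
    }

private:
    mutable std::mutex _mutex;
    std::atomic<State> _download_state{State::EMPTY};
};
```

A caller that already holds the cache lock could then call `state_relaxed()` to update a metric without acquiring the block lock inside the cache lock's critical section.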
DEFINE_mInt64(cache_lock_wait_long_tail_threshold_us, "30000000");
DEFINE_mInt64(cache_lock_held_long_tail_threshold_us, "30000000");
30000000 means 30 seconds; that seems too long...
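To make the unit explicit: the two thresholds are in microseconds, so 30000000 us / 1000000 = 30 s. The sketch below shows one way lock wait time and lock held time could be measured and compared against such a threshold. It is an illustration under assumptions, not the PR's implementation: `kLongTailThresholdUs`, `LockTimings`, `timed_critical_section()` and `is_long_tail()` are hypothetical names, and a plain `std::mutex` stands in for the cache lock.

```cpp
#include <chrono>
#include <cstdint>
#include <mutex>

// Hypothetical constant mirroring the default value in the diff (microseconds).
constexpr int64_t kLongTailThresholdUs = 30000000;

// Convert microseconds to seconds: 30000000 us -> 30 s.
constexpr double us_to_seconds(int64_t us) {
    return static_cast<double>(us) / 1e6;
}

struct LockTimings {
    int64_t wait_us;  // time spent waiting to acquire the lock
    int64_t held_us;  // time the lock was held while fn ran
};

// Runs fn under the mutex and reports how long we waited for the lock
// and how long we held it.
template <typename Fn>
LockTimings timed_critical_section(std::mutex& m, Fn fn) {
    using Clock = std::chrono::steady_clock;
    using std::chrono::duration_cast;
    using std::chrono::microseconds;

    const auto before_lock = Clock::now();
    std::unique_lock<std::mutex> lock(m);
    const auto after_lock = Clock::now();
    fn();
    lock.unlock();
    const auto after_unlock = Clock::now();

    return LockTimings{
        static_cast<int64_t>(
            duration_cast<microseconds>(after_lock - before_lock).count()),
        static_cast<int64_t>(
            duration_cast<microseconds>(after_unlock - after_lock).count())};
}

// A measurement counts as "long tail" when it exceeds the threshold.
inline bool is_long_tail(int64_t elapsed_us, int64_t threshold_us) {
    return elapsed_us > threshold_us;
}
```

With a 30 s threshold, only pathological stalls would be flagged; a smaller value would surface ordinary contention as well, which is the trade-off the comment above raises.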
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)