-
Notifications
You must be signed in to change notification settings - Fork 451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
calling caputres_iter
in an async
function may cause memory leak
#1214
Comments
This doesn't imply a leak. When you run a regex, it has to allocate space to run the search. This space is tied to the When you clone the The expectation that all memory allocated during a search gets released is just wrong. A memory leak is something that happens when memory used grows without bound given an otherwise fixed set of resources. |
Thanks. Cloning the original regex object does indeed avoid the growth of memory usage. - for (i, caps) in REG.captures_iter(&text).enumerate() {
+ for (i, caps) in REG.clone().captures_iter(&text).enumerate() { // Total: 1.3M
for (i, caps) in REG.clone().captures_iter(&text).enumerate() { // <--- 0.8M
// ...
} According to the regex documentation, we should indeed avoid using regex across threads. So this is not a bug. Close this issue. |
Using a regex across threads is fine. And cloning a regex for every search is almost certainly wrong and will lead to much much slower searches. Your peak memory usage should generally be the same in either case. |
And that section in the regex docs is specifically about contention on shared mutable resources leading to slower searches. Not memory usage. |
If I execute
Is there an upper limit to the memory occupied by regex? |
After testing, the memory usage will not keep growing when executing the code snippet below (no regex cloning here). + fn get_tag_positions(src: &str) -> Vec<(usize, usize)> {
+ REG
+ .captures_iter(src)
+ .map(|caps| {
+ let m = caps.get(0).unwrap();
+ (m.start(), m.end())
+ })
+ .collect::<Vec<_>>()
+ }
// ...
- for (i, caps) in REG.captures_iter(&text).enumerate() {
+ for (i, (start, end)) in get_tag_positions(&text).into_iter().enumerate() { |
I finally had a chance to take a quick peek at this, and your MRE is over-complicated. So I trimmed it down to this: use std::time::Duration;
use once_cell::sync::Lazy;
use regex::Regex;
static REG: Lazy<Regex> = Lazy::new(|| Regex::new(r#"(Object\.\w+)"#).unwrap());
async fn memory_leak_demo() -> Vec<u8> {
let text = tokio::fs::read_to_string("vue.cjs.js").await.unwrap();
let mut buf = Vec::from(text.as_bytes());
for (i, caps) in REG.captures_iter(&text).enumerate() {
println!("process text: {}", i + 1);
tokio::time::sleep(Duration::from_millis(50)).await;
let m = caps.get(0).unwrap();
let start = m.start();
let end = m.end();
let text = &text[start..end];
buf.extend(text.as_bytes());
}
buf
}
async fn task() {
for _ in 0..200_000 {
let buf = memory_leak_demo().await;
let s = String::from_utf8(buf).unwrap_or_default();
println!("string length: {}", s.len());
}
}
#[tokio::main]
async fn main() {
tokio::spawn(async {
task().await;
})
.await
.unwrap();
println!("done");
} With this
Notice that Your measurement process is likely flawed. Diagnosing a memory leak is hard. But the fact that you're seeing differences between async and sync should be a tip-off that perhaps there is something about async that is causing higher peak memory usage for you. For example, if you have a bunch of tasks running simultaneously, perhaps with some paused because of your If you want to demonstrate a leak, then please provide a simple program that I can run where the resident memory usage grows without bound while otherwise using fixed resources as reported by a process monitor like |
Yes, as I said, the likely reason here is that in your "synchronous" version, there are no await points while you're collecting the matches in your regex. So your async task cannot be stopped while you're consuming So I think the lesson here is to be careful of introducing await points while consuming a regex match iterator, because a regex match iterator has memory associated with it that can't be shared with another iterator so long as it's live. |
I've been debugging a memory "leak" that I only finally tracked down to this allocation of memory. The application I maintain (the rust port of the matomo device detector) has tens of thousands of regexes and they must be run concurrently, but locking every individual regex is not great, and cloning before every match is not great, trying to somehow reinitialize a regex through Lazy periodically might be doable, somehow (maybe), limiting the concurrency of queries to the system is really not great, but I'm really not sure what else to do, as the memory gradually creeps up over the span of weeks. Maybe if the cache statistics on the amount of memory stored was accessible somehow through the regex and could be cleared occasionally, somehow? |
@mindreader It might be worth pursuing a compaction strategy: regex/regex-automata/src/util/pool.rs Lines 318 to 320 in ab88aa5
Or alternatively, we could be more aggressive about throwing away scratch memory here: regex/regex-automata/src/util/pool.rs Lines 607 to 625 in ab88aa5
Today, we only throw memory away when there is too much contention on the shared resource. But maybe there should be other cases where we throw it away. It seems hard to pick a correct heuristic for this. And adding a method to "clear cached memory because it might get too big over time" seems pretty unfortunate. I'm going to re-open this and think about it a bit more. |
Thanks for pointing out the relevant code. I will definitely take a closer
look.
|
Another point to consider is that the memory increase should only occur if you're iterating over matches, and within the handling of each element there are await points. If there aren't any await points, there there isn't any opportunity for the scratch memory to be held by a task and not sent back to the pool. This happens because memory is retrieved at the start of iteration and not put back until the iterator is dropped. This is done to avoid needing to retrieve this memory for each search. But perhaps this is the wrong trade-off to make when each iteration has an await point. |
Everything you say tracks, it shouldn't lose memory at all, but it seems to in my case. But I need to find a truly minimal test case before I can even begin to understand why. I had no idea how much effort went into this. While I intend to experiment with this, I have little hope that I will discover something you haven't, so feel free to close this, and if I do by miracle find something workable I promise I will respond or open a PR. |
Do you have await points in your captures_iter loop? |
My case isn't exactly like the op, I'm not capturing over await points per se, but rather operating on single super large collection of Regex's shared via an Arc in separate threads. There are actually no awaits in code, just in the outer loop, no data shared between loops. That means that every regex should have allocated its own scratch basically, and that scratch should be reclaimed except for the "owner" thread. But the results from the tests I'm running don't really make sense to me. I have far too many variables in my code to consider this a worthwhile bug report, please I beg, just let me get some spare time to try and remove everything that could be complicating this before taking anything I say seriously. |
Aye, understood. Thank you for investigating! |
I had some time this weekend to run a lot of tests and now I understand how it all works. It isn't a loss of memory, it is just that over time if your application uses a lot of different threads, you will accrue 8 full cache lines on each regex and they will never go away. It isn't a matter of lots of concurrency, merely accessing via many threads gradually will eventually fill up the cache, even if each one is accessed one thread at a time. That's why this happens to me over a span of weeks. The following change was enough to reduce my memory usage in real world scenarios by 50%+ while increasing cpu usage by only 15%, a tradeoff I would happily accept, except for the difficulty of exposing that level of configurability without forking.
So we get rid of the owner thread cache because it is basically used once and never again in this scenario. And we remove all but the first cache since it will almost definitely never need more than one at a time, and I'm willing to allocate in the case that we do. Since memory can only be reclaimed upon use of the regex, that greatly limits the ways in which you could optimize this. I thought of maybe a cache where each cache line was thread independent but it would take some real time to validate this theory and I have a feeling it would definitely hurt some benchmarks. |
Ah I see. That's interesting. I do still wonder if number of threads has a role to play here though, since each cache line is a stack of reusable allocations. As long as there isn't any contention, these stacks can continue to grow I think. But I suppose growth is only possible when there is actual concurrency happening. Some thoughts:
|
It is an interesting problem. I will give it a shot if I can get enough spare time. I can't honestly give super good odds of success. |
What version of regex are you using?
Latest 1.10.5
Describe the bug at a high level.
Calling the static regular instance's
caputres_iter
in anasync
function may cause memory leak.What are the steps to reproduce the behavior?
What is the actual behavior?
The memory allocated during the execution of the
captures_iter
method was not released. I dump the memory profile file during the program running. https://github.com/ziyoung/memory-leak-example?tab=readme-ov-file#regex-memory-issueIf I change my code like follwings, there's no memory leak. https://github.com/ziyoung/memory-leak-example/pull/1/files#diff-42cb6807ad74b3e201c5a7ca98b911c5fa08380e942be6e4ac5807f8377f87fc
What is the expected behavior?
There's no memory leak.
The text was updated successfully, but these errors were encountered: