-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
@floop slows down function #10
Comments
This improves the type inference result for #10. However, it does not improve the performance of it.
I think this is a really nice example. I got inference working with #11 but the performance is not great yet (it looks like the inliner just give up). There are a few other ideas to try. |
Required for JuliaFolds/FLoops.jl#10
OK, with JuliaFolds/Transducers.jl#282, I now have julia> @btime prime_factors1($98623589);
1.500 ms (4 allocations: 208 bytes)
julia> @btime prime_factors3($98623589);
1.850 ms (206329 allocations: 6.30 MiB) With a bit of cheating (you don't need to put function prime_factors3_cheating(n)
factors = Int[]
@floop begin
n
for i in Iterators.flatten((2, 3:2:n))
while n % i == 0
push!(factors, i)
n = div(n, i)
end
n == 1 && break
end
end
factors
end I get julia> @btime prime_factors3_cheating($98623589);
724.095 μs (2 allocations: 128 bytes) I think it's possible to automatically exclude |
Just as a reference, the unrolled version https://julialang.zulipchat.com/#narrow/stream/225542-helpdesk/topic/Fast.20chained.20iterator.3F/near/197555379 function prime_factors2(n::Integer)
factors = Int[]
while n % 2 == 0
push!(factors, 2)
n = div(n, 2)
end
n == 1 && return factors
for i in 3:2:n
while n % i == 0
push!(factors, i)
n = div(n, i)
end
n == 1 && return factors
end
factors
end is still faster: julia> @btime prime_factors2($98623589);
653.200 μs (2 allocations: 128 bytes) |
I assume the macro expands to basically the equivalent of my attempt at a hand-written transducers solution when moving function prime_factors4(n)
factors = Int[]
foldl(Cat(), (2, 3:2:n); init=n) do n, i
while n % i == 0
push!(factors, i)
n ÷= i
end
n == 1 ? reduced(n) : n
end
factors
end There was a large performance gap between the julia> function prime_factors5(n)
factors = Int[]
foreach(Cat(), (2, 3:2:n)) do i
while n % i == 0
push!(factors, i)
n ÷= i
end
n == 1 && reduced()
end
factors
end
prime_factors5 (generic function with 1 method)
julia> @btime prime_factors1($98623589);
1.786 ms (4 allocations: 208 bytes)
julia> @btime prime_factors4($98623589);
1.643 ms (5 allocations: 176 bytes)
julia> @btime prime_factors5($98623589);
16.747 ms (410286 allocations: 6.26 MiB) |
Yeah, that's right.
As you said, there is the boxing problem so I don't think it's possible to make the |
* Inline more transducer methods Required for JuliaFolds/FLoops.jl#10 * Remove "broken annotation" from the now-passing test
This improves the type inference result for #10. However, it does not improve the performance of it.
As far as I can tell,
prime_factors3
is the@floop
equivalent ofprime_factors1
, but it runs significantly slower with a lot more allocations:The text was updated successfully, but these errors were encountered: