Speeding up strain rate and deviatoric stress kernels #110

Merged 1 commit on Sep 22, 2023

@albert-de-montserrat albert-de-montserrat commented Sep 21, 2023

Powers with floating-point exponents are quite slow, and this is where most of the time is spent when computing the strain rate or deviatoric stress of creep laws. However, in many cases the powers in these equations are just x^0, x^1 or 1^n, which do not need to be computed at all. Addresses #109

This PR adds a check to see whether any of these cases occurs, by introducing:

@inline function pow_check(x::T, n) where T
    if isone(x) || isone(n)
        x
    elseif iszero(n)
        one(T)
    else
        fastpow(x, n)
    end
end

in all the compute_εII and compute_τII methods.
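As a rough, self-contained illustration of the check above, the sketch below substitutes Base `^` for the package's `fastpow` (an assumption made only so the snippet runs on its own; `fastpow_standin` and `pow_check_sketch` are hypothetical names). It verifies that the shortcut branches return the same values as the generic power.

```julia
# Stand-in for the package's fastpow; using Base `^` here is an assumption.
fastpow_standin(x, n) = x^n

@inline function pow_check_sketch(x::T, n) where T
    if isone(x) || isone(n)
        x                       # 1^n == 1 and x^1 == x, no power needed
    elseif iszero(n)
        one(T)                  # x^0 == 1
    else
        fastpow_standin(x, n)   # general case, pays for the power
    end
end

# The shortcut branches agree with the generic power.
for (x, n) in ((1.5, 0.0), (1.5, 1.0), (1.0, 3.5), (1.5, 2.3))
    @assert pow_check_sketch(x, n) ≈ x^n
end
```

The branches cost a few comparisons per call, which is cheap relative to a floating-point power, so the check should pay off whenever the trivial cases are common.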

This gives a noticeable speedup when, e.g., computing the viscosity:

εII = 1e-15
τII = 1e6
P, T = 1e9, 1e3;
args = (; P = P, T = T);

# create rheology struct
diff     = DiffusionCreep()
disl     = DislocationCreep()
rheology = SetMaterialParams(;
    CompositeRheology = CompositeRheology((diff, disl)),
)

This branch:

In [31]: @btime compute_viscosity_εII($(rheology, τII, args)...)
  34.542 ns (0 allocations: 0 bytes)
0.4345077461371434

In [32]: @btime compute_viscosity_τII($(rheology, τII, args)...)
  20.240 ns (0 allocations: 0 bytes)
0.4345077461371434

main branch:

In [51]: @btime compute_viscosity_εII($(rheology, τII, args)...)
  50.557 ns (0 allocations: 0 bytes)
0.4345077461371434

In [52]: @btime compute_viscosity_τII($(rheology, τII, args)...)
  43.145 ns (0 allocations: 0 bytes)
0.4345077461371434

This could potentially be improved by moving some of these checks to compile time, but this means wrapping the exponents in the creep law objects in Vals:

@inline pow_check2(::T, ::Val{0.0}) where T = one(T)
@inline pow_check2(::T, ::Val{0}) where T = one(T)
@inline pow_check2(x, ::Val{1.0}) = x
@inline pow_check2(x, ::Val{1}) = x
@inline pow_check2(x, ::Val{N}) where N =  pow_check2(x, N)

@inline function pow_check2(x::T, n) where T
    if isone(x)
        x
    else
        fastpow(x, n)
    end
end

In [33]: V0 = Val(0.0)
Val{0.0}()

In [34]: @btime pow_check($(1.5, 0.0)...)
  6.500 ns (0 allocations: 0 bytes)
1.0

In [35]: @btime pow_check2($(1.5, V0)...)
  1.600 ns (0 allocations: 0 bytes)
1.0

In [36]: V1 = Val(1.0)
Val{1.0}()

In [37]: @btime pow_check($(1.5, 1.0)...)
  6.400 ns (0 allocations: 0 bytes)
1.5

In [38]: @btime pow_check2($(1.5, V1)...)
  2.700 ns (0 allocations: 0 bytes)
1.5

In [39]: @btime pow_check($(1.5, 2.3)...)
  14.515 ns (0 allocations: 0 bytes)
2.5410306047779248

In [40]: @btime pow_check2($(1.5, Val(2.3))...)
  12.800 ns (0 allocations: 0 bytes)
2.5410306047779248
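To make the compile-time variant concrete, here is a minimal sketch of what wrapping an exponent in a `Val` could look like on the creep law side. `ToyCreep`, `pow_val` and `strain_rate` are hypothetical names invented for this illustration and are not part of the package; Base `^` again stands in for `fastpow`.

```julia
# Hypothetical creep law: the exponent is stored as a Val, so its value
# is part of the type and dispatch can be resolved at compile time.
struct ToyCreep{N,T}
    A::T         # prefactor
    n::Val{N}    # stress exponent, lifted to the type domain
end
ToyCreep(A, n::Number) = ToyCreep(A, Val(n))

# Trivial exponents are resolved by dispatch, with no runtime branch.
@inline pow_val(::T, ::Val{0}) where T = one(T)
@inline pow_val(::T, ::Val{0.0}) where T = one(T)
@inline pow_val(x, ::Val{1}) = x
@inline pow_val(x, ::Val{1.0}) = x
# Fallback: Base `^` is an assumption standing in for fastpow.
@inline pow_val(x, ::Val{N}) where N = x^N

# Toy power law: ε = A * τ^n
strain_rate(c::ToyCreep, τ) = c.A * pow_val(τ, c.n)

linear = ToyCreep(2.0, 1.0)   # n == 1, the power is skipped entirely
cubic  = ToyCreep(2.0, 3.0)   # general case
@assert strain_rate(linear, 3.0) == 6.0
@assert strain_rate(cubic, 2.0) == 16.0
```

The trade-off is that every distinct exponent value yields a distinct concrete type, so each one triggers its own method specialization, and constructing the object from a runtime value is itself not type-stable.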

@albert-de-montserrat albert-de-montserrat merged commit ec3b099 into main Sep 22, 2023
14 of 20 checks passed
@albert-de-montserrat albert-de-montserrat deleted the adm-powcheck branch September 22, 2023 20:49