Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


optimization - Efficiently filling a block in a matrix using StaticArrays

When using static arrays, I noticed that there is a huge efficiency loss depending on how you convert/create static matrices:

using StaticArrays, BenchmarkTools, LinearAlgebra  # @btime and I are used below

code_fast() = SMatrix{2,2}(1.0 , 1.0 , 1.0 , 1.0); code_fast(); @btime code_fast();
>> 0.026 ns (0 allocations: 0 bytes)

code_slow() = SMatrix{2,2}([1.0  1.0 ; 1.0  1.0]); code_slow(); @btime code_slow();
>> 45.371 ns (1 allocation: 112 bytes)

The slow code can be easily corrected, however, by setting

code_slow() = @SMatrix [1.0  1.0 ; 1.0  1.0]

for which it is no longer slow and becomes as fast as code_fast. The problem is that to use the @SMatrix macro one must write out the components of the matrix by hand, or else use rand, zeros or ones. If this cannot be done, I could find no option besides calling SMatrix{n,n}, which makes static arrays so slow that they become useless.

Concretely, the type of function I need to optimize looks like

test(n,G,mat) = begin mat[1+n:2n,1:n] = G; mat end

where G is a complicated static matrix that is calculated inside another function. I need to insert this matrix into mat as a block, and I need the output to be static as well (this operation is performed billions of times). If I convert to a static matrix at the end, that is:

test_SA(n,G,mat)  = begin mat[1+n:2n,1:n] = G; SMatrix{2n,2n}(mat) end

I run into the aforementioned problem of SMatrix being too slow:

n=2
G = @SMatrix ones(Float64,n,n)
mat = Matrix{Float64}(I,2n,2n)

test(n,G,mat); @btime test($n,$G,$mat);          # 17.821 ns (0 allocations: 0 bytes)
test_SA(n,G,mat); @btime test_SA($n,$G,$mat);    # 1.825 μs (22 allocations: 1.36 KiB)

Can someone please explain what is going on, and how to correct it? I would really like to use static arrays here, since all the other modules in the code run much faster precisely because of them.



1 Answer


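As the name indicates, the size of a StaticArray must be static, that is, known at compile time. In your code it is dynamic: 2n is a runtime value, so the compiler cannot infer the type SMatrix{2n,2n} and the construction becomes type-unstable and allocates. Your problem can be narrowed down to: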

julia> @btime  SMatrix{2*$n,2*$n}($mat);
  1.310 μs (22 allocations: 1.36 KiB)

What to do about it? The first option is to use fixed sizes that are known at compile time:

julia> @btime SMatrix{4,4}($mat);
  1.599 ns (0 allocations: 0 bytes)
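If n has to stay a parameter, one possible way to keep the size visible to the compiler is to pass it as a Val, so that it arrives as a type parameter. The sketch below is only an illustration under that assumption; fill_block is a hypothetical name, not part of StaticArrays, and Val(n) itself should be created once outside the hot loop (or from a constant), since building it from a runtime n is again a dynamic operation.

```julia
using StaticArrays, LinearAlgebra

# Hypothetical helper: N is a type parameter here, so 2N is a constant
# inside the method body and SMatrix{2N,2N} is a concrete size.
function fill_block(::Val{N}, G, mat) where {N}
    mat[N+1:2N, 1:N] = G           # same in-place block write as test(n, G, mat)
    return SMatrix{2N,2N}(mat)     # convert with a statically known size
end

n = 2
G = @SMatrix ones(Float64, n, n)
mat = Matrix{Float64}(I, 2n, 2n)

S = fill_block(Val(n), G, mat)     # Val(n) built once, then reused
```

Timings are not shown here; benchmarking fill_block with @btime on your own machine is the way to check whether it removes the allocations.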

This will not always be convenient, so fortunately there is another reasonable option: create the target type beforehand (and perhaps reuse it a billion times).

MySMatrix =  SMatrix{2n,2n}

Once created it will be blazing fast:

julia> @btime $MySMatrix($mat)
  1.799 ns (0 allocations: 0 bytes)
4×4 SArray{Tuple{4,4},Float64,2,16} with indices SOneTo(4)×SOneTo(4):
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 1.0  1.0  1.0  0.0
 1.0  1.0  0.0  1.0
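Since G is already static, another option worth trying is to skip the intermediate Matrix entirely and assemble the result from static blocks with hcat and vcat, which StaticArrays defines for static matrices. This is a minimal sketch that assumes the surrounding blocks are the identity and zero blocks of the example above; block_static is a hypothetical helper name.

```julia
using StaticArrays, LinearAlgebra

# Hypothetical helper: builds [I 0; G I] from static blocks only.
# N and T are taken from the type of G, so every size is known statically.
function block_static(G::SMatrix{N,N,T}) where {N,T}
    Id = SMatrix{N,N,T}(I)         # static identity block
    Z  = zeros(SMatrix{N,N,T})     # static zero block
    return vcat(hcat(Id, Z), hcat(G, Id))
end

G = @SMatrix ones(Float64, 2, 2)
M = block_static(G)
```

For this G the result has the same entries as the 4×4 matrix printed above. If the other blocks of mat are not identity and zero, they would have to be passed in as static matrices as well.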

If you are generating the static arrays yourself, you should also take a look at StaticArrays.sacollect, which is very fast as well:

julia> @btime StaticArrays.sacollect($MySMatrix, i*j for i in 1:4, j in 1:4)
  1.399 ns (0 allocations: 0 bytes)
4×4 SArray{Tuple{4,4},Int64,2,16} with indices SOneTo(4)×SOneTo(4):
 1  2   3   4
 2  4   6   8
 3  6   9  12
 4  8  12  16
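The same idea can be applied to the block-filling problem from the question: describe each entry of the result with a generator and let sacollect build the static matrix. The following is a sketch only; entry is a hypothetical helper, and the identity pattern outside the block is assumed from the example in the question.

```julia
using StaticArrays

G = @SMatrix [1.0 2.0; 3.0 4.0]

# Hypothetical helper: entry (i, j) of the 4×4 result.
# Rows 3:4, columns 1:2 come from G; everything else is the identity.
entry(G, i, j) = (i > 2 && j <= 2) ? G[i-2, j] : (i == j ? 1.0 : 0.0)

# The generator runs with i varying fastest, which matches the
# column-major order in which the static matrix is filled.
M = StaticArrays.sacollect(SMatrix{4,4,Float64},
                           entry(G, i, j) for i in 1:4, j in 1:4)
```

Whether this is faster than converting a preallocated Matrix depends on how expensive entry is, so it is worth benchmarking both variants.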
