Julia series 10: GPU programming on Apple M1

1. Installing Julia

First, go to the official website and download the build that matches your machine; for Apple M1 chips, that is the M-series build.
Then launch Julia and install IJulia:

using Pkg
Pkg.add("IJulia")

Then open Jupyter, and you can create a new Julia notebook.

2. The M1 GPU programming library: Metal

Installation:

julia> import Pkg; Pkg.add("Metal")
julia> using Metal

Usage is very similar to the CUDA library. Take the following example:

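# Kernel: each GPU thread adds one pair of elements
function vadd(a, b, c)
   i = thread_position_in_grid_1d()   # 1-based index of this thread
   c[i] = a[i] + b[i]
   return
end

a = MtlArray([1]); b = MtlArray([2]); c = similar(a);
@metal threads=length(c) vadd(a, b, c)   # launch one thread per element
synchronize()                            # wait for the GPU to finish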
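The one-element example above launches exactly one thread per element. For larger arrays the work is usually split into threadgroups, and the thread count may overshoot the array length, so a bounds guard is needed. The following is a minimal sketch of that pattern, not the original author's code: the name `vadd_guarded`, the size `n = 1000`, and the threadgroup size of 256 are illustrative assumptions, and `groups` is the keyword used by recent Metal.jl releases.

```julia
using Metal

# Hypothetical guarded variant of vadd: threads past the end of the array do nothing.
function vadd_guarded(a, b, c)
    i = thread_position_in_grid_1d()
    if i <= length(c)
        c[i] = a[i] + b[i]
    end
    return
end

n = 1000                                  # illustrative size, not a multiple of 256
a = MtlArray(Float32.(1:n))
b = MtlArray(ones(Float32, n))
c = similar(a)

nthreads = 256                            # threads per threadgroup (assumed value)
ngroups = cld(n, nthreads)                # enough threadgroups to cover all n elements

@metal threads=nthreads groups=ngroups vadd_guarded(a, b, c)
synchronize()                             # wait for the GPU before reading the result

result = Array(c)                         # copy the result back to a CPU array
println(result[1:5])
```

Rounding the group count up with `cld` and guarding inside the kernel keeps the launch valid for any array length.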

Next, let's compare performance on a polynomial evaluation:

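using Metal
using Polynomials

# GPU kernel: evaluate 9 + 8x + ... + x^8 at a[i] using Horner's rule
function mpoly(a, c)
    i = thread_position_in_grid_1d()
    acc = 0f0                          # start from zero; c itself is uninitialized
    for j in 1:9
        acc = Float32(j) + acc * a[i]
    end
    c[i] = acc
    return
end

# CPU reference: the same polynomial via Polynomials.jl
function poly(a, c)
    p = Polynomial(9:-1:1)
    for j in eachindex(a)
        c[j] = p(a[j])
    end
    return
end

testa = MtlArray{Float32}(rand(10))
testc = similar(testa)
a = rand(1024*2048*64);
c = similar(a);
ma = MtlArray{Float32}(a);
mc = similar(ma);

# Warm-up runs on small inputs, so that compilation stays out of the timings.
# Recent Metal.jl releases name the threadgroup-count keyword groups; older releases used grid.
@metal threads=10 groups=1 mpoly(testa, testc)
poly(rand(10), zeros(10))              # CPU warm-up on plain Arrays, the types timed later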
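The loop in the GPU kernel relies on Horner's rule: feeding the coefficients 1, 2, ..., 9 in that order, starting from an accumulator of zero, yields the same polynomial as Polynomial(9:-1:1), namely 9 + 8x + ... + x^8. Here is a small CPU-only sketch to check that claim. The function name `horner_1to9` is made up for illustration, and Base's `evalpoly` (available since Julia 1.4) stands in for Polynomials.jl so the snippet needs no extra packages.

```julia
# Horner's rule with the same coefficient order as the kernel loop (1, 2, ..., 9).
function horner_1to9(x)
    acc = zero(x)              # the accumulator must start at zero
    for j in 1:9
        acc = j + acc * x
    end
    return acc
end

# evalpoly takes coefficients in ascending powers: 9 + 8x + 7x^2 + ... + x^8
coeffs = (9, 8, 7, 6, 5, 4, 3, 2, 1)

x = 0.5
println(horner_1to9(x) ≈ evalpoly(x, coeffs))
```

If the accumulator started from an arbitrary value instead of zero, the result would pick up an extra term proportional to x^9, which is why the starting value matters.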

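With the small warm-up run done, time both versions:

@time @metal threads=1024 groups=2048*64 mpoly(ma,mc)   # groups: threadgroup count (grid in older Metal.jl)
@time poly(a,c)

The results:
0.000421 seconds (132 allocations: 3.680 KiB)
0.604166 seconds (71.97 k allocations: 3.755 MiB, 3.64% compilation time)
On paper the gap is more than 1400-fold, which looks spectacular. Treat it as an upper bound, though: @metal returns as soon as the kernel is queued, so the first figure mostly measures launch overhead, and the timed region would need to include synchronize() to capture the kernel's actual run time. The CPU figure also includes a little compilation and works in Float64, while the GPU arrays are Float32.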
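To time the kernel itself rather than just its launch, the wait for the GPU has to sit inside the timed region. The sketch below shows one way to do that with the same synchronize() call used earlier, and also checks the GPU output against a CPU evaluation. It is an illustration under assumptions, not the author's benchmark: the kernel name `mpoly_guarded`, the size `n = 2^20`, the threadgroup size of 256, and the tolerance are all chosen here, and no timing result is claimed.

```julia
using Metal

# Hypothetical kernel variant: local accumulator plus a bounds guard.
function mpoly_guarded(a, c)
    i = thread_position_in_grid_1d()
    if i <= length(c)
        acc = 0f0
        for j in 1:9
            acc = Float32(j) + acc * a[i]
        end
        c[i] = acc
    end
    return
end

n = 2^20                                  # small illustrative size
a = rand(Float32, n)
ma = MtlArray(a)
mc = similar(ma)
nthreads = 256                            # threads per threadgroup (assumed value)
ngroups = cld(n, nthreads)

# Warm-up launch so that kernel compilation is not timed.
@metal threads=nthreads groups=ngroups mpoly_guarded(ma, mc)
synchronize()

# Timed region covers launch and execution: synchronize() blocks until the GPU is done.
@time begin
    @metal threads=nthreads groups=ngroups mpoly_guarded(ma, mc)
    synchronize()
end

# CPU reference in Float32 using Base's evalpoly (ascending coefficients 9, 8, ..., 1).
coeffs = (9f0, 8f0, 7f0, 6f0, 5f0, 4f0, 3f0, 2f0, 1f0)
expected = [evalpoly(x, coeffs) for x in a]
results_match = isapprox(Array(mc), expected; rtol=1f-4)
println(results_match)
```

For a fairer CPU comparison, the CPU side would also be run in Float32 and timed after its own warm-up call.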

Reposted from blog.csdn.net/kittyzc/article/details/126493124