CUDA reduce: parallel reduction sum
