郭禾
Paper type: Conference paper
Publication date: 2015-07-26
Indexed in: EI, CPCI-S
Volume: 9576
Pages: 113-119
Keywords: DEM; GPU computing; Performance optimization; Stencil model
Abstract: High performance and efficiency in parallel computing are essential for large-scale discrete element method (DEM) simulation. After analyzing a DEM simulation framework built on a graphics processing unit (GPU) platform with the CUDA architecture and evaluating the simulation data, we propose three optimization methods to improve the performance of the system. First, a stencil computation model is applied to grid-based particle searching and force calculation, structuring both particle-particle contact detection and neighboring-particle searching. Second, an effective parallel granularity is found by varying the number of blocks and threads on the GPU. Third, shared memory is used for data prefetching and for storing intermediate results, with its use guided by analysis and calculation. Experimental results show that the stencil model is useful for particle searching and force calculation, and that a well-chosen parallel granularity together with appropriate use of shared memory improves the performance of the DEM simulation framework.
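
As a rough illustration of the grid-based stencil search mentioned in the abstract, the sketch below bins particles into a uniform 2D grid and checks each cell only against its 3x3 neighborhood. It is a minimal serial Python sketch, not the paper's CUDA implementation: the function names `build_grid` and `find_contacts`, the 2D setting, equal particle radii, and a cell size of one particle diameter are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import product


def build_grid(positions, cell_size):
    """Bin particle indices by the integer coordinates of their grid cell."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(positions):
        grid[(int(x // cell_size), int(y // cell_size))].append(i)
    return grid


def find_contacts(positions, radius):
    """Return sorted index pairs (i, j), i < j, of touching or overlapping particles.

    Assumption: all particles share the same radius, so a cell size of one
    diameter guarantees that any contacting pair lies in the same cell or
    in adjacent cells (the 3x3 stencil).
    """
    diameter = 2.0 * radius
    grid = build_grid(positions, diameter)
    contacts = set()
    for (cx, cy), members in grid.items():
        # Visit the 3x3 stencil of cells centred on (cx, cy).
        for dx, dy in product((-1, 0, 1), repeat=2):
            # .get() avoids inserting empty cells into the defaultdict.
            neighbours = grid.get((cx + dx, cy + dy), ())
            for i in members:
                xi, yi = positions[i]
                for j in neighbours:
                    if i < j:
                        xj, yj = positions[j]
                        if (xi - xj) ** 2 + (yi - yj) ** 2 <= diameter ** 2:
                            contacts.add((i, j))
    return sorted(contacts)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.9, 0.0), (5.0, 5.0), (5.0, 5.8), (10.0, 10.0)]
    print(find_contacts(pts, radius=0.5))
```

On a GPU, one plausible mapping is to assign each cell or each particle to a thread and stage the neighboring cells' particle data in shared memory, but how the paper actually maps the stencil to blocks and threads is not stated in the abstract.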