HPL

HPL (the High-Performance Linpack Benchmark) is the standard benchmark for measuring the floating-point performance of high-performance computing clusters. It evaluates a cluster's floating-point capability by solving a dense system of N linear equations in N unknowns with Gaussian elimination.

Floating-point peak performance is the number of floating-point operations a computer can complete per second, and comes in two forms: the theoretical peak and the measured peak. The theoretical peak is the number of floating-point operations the machine could complete per second in theory, determined mainly by the CPU clock frequency: theoretical peak = CPU frequency × number of CPU cores × floating-point operations per cycle.
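As a rough worked example (the 2.5 GHz clock, 64 cores, and 32 double-precision FLOPs per cycle below are assumed figures for illustration, not the spec of any node in this guide):

# Hypothetical node: 2.5 GHz, 64 cores, 32 DP FLOPs/cycle (e.g. AVX-512 with 2 FMA units)
freq_ghz=2.5
cores=64
flops_per_cycle=32
awk -v f=$freq_ghz -v c=$cores -v p=$flops_per_cycle \
    'BEGIN { printf "theoretical peak: %.1f Gflops\n", f * c * p }'
# => theoretical peak: 5120.0 Gflops, i.e. 5.12 Tflops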

1. Job submission parameters

Users can submit HPL jobs through the public template; the HPL-related job parameters are listed below:

Parameter    Description
HPL Ns       Number and size of the problem matrices to solve; as a rule of thumb N×N×8 ≈ 80% of total system memory (see the sizing sketch after this table)
HPL NBs      Block size used to partition the matrix during the solve; generally less than 384, and NB×8 must be a multiple of the cache line size
HPL PS       Number of processes in the horizontal direction of the process grid; P ≤ Q, and P is best chosen as a power of 2
HPL QS       Number of processes in the vertical direction of the process grid; P×Q must equal the total number of CPU cores
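A minimal sizing sketch for Ns, assuming a hypothetical total of 576 GB of memory (3 nodes × 192 GB); substitute your own cluster's memory:

# Choose N so that the N x N double-precision matrix uses ~80% of total memory
total_mem_bytes=$((576 * 1024 * 1024 * 1024))   # assumed: 3 nodes x 192 GB
n=$(awk -v m=$total_mem_bytes 'BEGIN { printf "%d\n", sqrt(m * 0.8 / 8) }')
echo "suggested Ns: $n"                          # round down to a multiple of NBs in practice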

2. HPL job run reference

Before testing, prepare the input file HPL.dat locally; it contains the parameters for the HPL run.

1. HPL.dat file:

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)  # output file name
1            device out (6=stdout,7=stderr,file)    # any value other than 6 or 7 writes output to the file named on the previous line
1            # of problems sizes (N)
77852        Ns
1            # of NBs
128          NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
4            Ps
8            Qs
16.0         threshold
3            # of panel fact
0 1 2        PFACTs (0=left, 1=Crout, 2=Right)
2            # of recursive stopping criterium
2 4          NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
3            # of recursive panel fact.
0 1 2        RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
0            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
0            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)

2. Run

mpirun --allow-run-as-root ./xhpl
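The number of MPI ranks launched must equal Ps × Qs from HPL.dat (4 × 8 = 32 for the file above). A minimal sketch with the rank count made explicit, assuming Open MPI and enough available slots:

mpirun -np 32 --allow-run-as-root ./xhpl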

3. Run result (one line excerpted)

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR00L2L2       77852   128     2    16            1470.20              2.140e+02    # with N=77852, NB=128, P=2, Q=16 the measured result is 2.140e+02 Gflops, i.e. 214.0
HPL_pdgesv() start time Tue Aug 10 10:53:02 2021

HPL_pdgesv() end time   Tue Aug 10 11:17:32 2021

HPL efficiency calculator: http://hpl-calculator.sourceforge.net/
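Efficiency is the measured Rmax divided by the theoretical peak (Rpeak) of the nodes used; the Rpeak value below is an assumed figure for illustration only:

rmax=214.0     # measured Gflops from the result line above
rpeak=268.8    # assumed theoretical peak of the participating nodes, in Gflops
awk -v a=$rmax -v b=$rpeak 'BEGIN { printf "efficiency: %.1f%%\n", a / b * 100 }'
# => efficiency: 79.6%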

4. Input file

#job_name=lmp
#run_time=24:00:00
partition=dell_intel
node_num=3
task_per_node=16
HPL_Ns=60000
HPL_NBs=192
HPL_PS=4
HPL_QS=12
tmp_dir=/home/wushiming/hpl
#work_dir=/home/wushiming/hpl
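These variables are read by the submission script in section 5 via source /home/wushiming/hpl/hpl_input. Note that node_num × task_per_node = 3 × 16 = 48 matches HPL_PS × HPL_QS = 4 × 12 = 48, since the process grid must account for every MPI rank.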

5. Submission script

#!/bin/sh
source /home/wushiming/hpl/hpl_input

##check input var
time=`date +%m%d_%H%M%S`

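# Fill in defaults for any variables left unset in hpl_input (job name, partition, work dir, run time)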
if [ "x$job_name" == "x" ];then
    sbatch_job_name="YHPC_$time "
else
    sbatch_job_name=$job_name
fi

if [ "x$partition" == "x" ];then
    sbatch_partition=""
else
    sbatch_partition=$partition
fi

if [ "x$work_dir" == "x" ];then
    mkdir -p /home/yhpc/YHPC_$time
    sbatch_work_dir=/home/yhpc/YHPC_$time
else
    sbatch_work_dir=$work_dir
fi

if [ "x$run_time" == "x" ];then
    sbatch_run_time=03:00:00
else
    sbatch_run_time=$run_time
fi

sbatch_node_num=$node_num
sbatch_task_per_node=$task_per_node

sbatch_err_log=$sbatch_work_dir/%j.err
sbatch_out_log=$sbatch_work_dir/%j.out


cp $tmp_dir/xhpl $sbatch_work_dir

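# Generate HPL.dat in the work directory from the hpl_input variables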
cat > $sbatch_work_dir/HPL.dat <<EOF
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
8            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
$HPL_Ns          Ns
1            # of NBs
$HPL_NBs      NBs
1            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
$HPL_PS          Ps
$HPL_QS          Qs
16.0         threshold
1            # of panel fact
2 1 0        PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
2            NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
1 0 2        RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
0            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
0            DEPTHs (>=0)
0            SWAP (0=bin-exch,1=long,2=mix)
1            swapping threshold
1            L1 in (0=transposed,1=no-transposed) form
1            U  in (0=transposed,1=no-transposed) form
0            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
EOF

### Write basic job information
#echo -e "The start time is: `date +"%Y-%m-%d %H:%M:%S"` \n"
#echo -e "My job ID is: $SLURM_JOB_ID \n"
#echo -e "The total cores is: $total_cores \n"
#echo -e "The hosts is: \n"
#srun -np $node_num -nnp 1 hostname
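# Generate the Slurm batch script; variables escaped as \$ are expanded at job run time, the rest now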
cat > $sbatch_work_dir/hpl.slurm <<EOF
#!/bin/bash
#SBATCH --ntasks-per-node=$sbatch_task_per_node
#SBATCH --job-name $sbatch_job_name
#SBATCH --nodes=$sbatch_node_num
#SBATCH --mail-type=ALL
#SBATCH --partition $sbatch_partition
#SBATCH --chdir=$sbatch_work_dir
#SBATCH -e $sbatch_err_log
#SBATCH -o $sbatch_out_log

ulimit -s unlimited
ulimit -l unlimited

# Load the runtime environment
module purge
module use /opt/ohpc/pub/modulefiles
source /opt/ohpc/pub/apps/intel/setvars.sh
module load intel/mpi-2021.1.1

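# Run Intel MPI over libfabric's verbs provider, using the team1.282 network interface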
export I_MPI_OFI_PROVIDER=Verbs
export FI_VERBS_IFACE=team1.282

echo -e "The start time is: `date +"%Y-%m-%d %H:%M:%S"`"
echo -e "My job ID is: \$SLURM_JOB_ID"
echo -e "The total cores is: \$SLURM_NPROCS"
echo -e "The \$SLURM_JOB_ID Job info:"
scontrol show job \$SLURM_JOB_ID

mpirun -genv I_MPI_FABRICS ofi ./xhpl

echo -e "The end time is: \`date +"%Y-%m-%d %H:%M:%S\`"
EOF

#sed -i 's/SLURM*/\$SLURM/g' $sbatch_work_dir/hpl.slurm
/usr/bin/sbatch $sbatch_work_dir/hpl.slurm
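Once sbatch accepts the job, the run output and errors are written to <jobid>.out and <jobid>.err in the work directory, per the -o and -e options set above.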
