Playing with gpgpu-sim, Note 02: What Gets Built

Eloudy · published 2023-09-19 11:34:18

Official documentation:

GPGPU-Sim 3.x Manual

1. Setting the environment variables

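Before building gpgpu-sim you must first run the setup_environment script by sourcing it (source setup_environment). The annotated script follows; it mainly sets environment variables that the Makefile uses later.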

# see README before running this
# The next line checks whether the current shell is bash, sh, or zsh; no other shell is supported
ps -p $$ | awk '/bash/ || / sh/ || /zsh/ {exit 1;}' && echo "ERROR ** source setup_environment must be run in a bash, zsh or sh shell; see README" && exit

# The next variable marks that source setup_environment ran successfully; it is cleared here and set to 1 at the end of this script;
export GPGPUSIM_SETUP_ENVIRONMENT_WAS_RUN=
# Next, set GPGPUSIM_ROOT to the directory that contains this script
export GPGPUSIM_ROOT="$( cd "$( dirname "$BASH_SOURCE" )" && pwd )"
# Next, read the gpgpu-sim version number from the version file in the source root and store it in GPGPUSIM_VERSION_STRING; with the contents of this version file, GPGPUSIM_BUILD_STRING ends up empty;
GPGPUSIM_VERSION_STRING=`cat $GPGPUSIM_ROOT/version | awk '/Version/ {print $8}'`
GPGPUSIM_BUILD_STRING=`cat $GPGPUSIM_ROOT/version | awk '/Change/ {print $6}'`
# Next, print both values to the terminal as an informational message;
echo -n "GPGPU-Sim version $GPGPUSIM_VERSION_STRING (build $GPGPUSIM_BUILD_STRING) ";

# Next, an error check: is CUDA_INSTALL_PATH empty? It must be set beforehand to match the local CUDA installation, typically export CUDA_INSTALL_PATH=/usr/local/cuda
if [ ! -n "$CUDA_INSTALL_PATH" ]; then
	echo "ERROR ** Install CUDA Toolkit and set CUDA_INSTALL_PATH.";
	return;
fi
# Next, an error check: does the directory named by CUDA_INSTALL_PATH actually exist on this system?
if [ ! -d "$CUDA_INSTALL_PATH" ]; then
	echo "ERROR ** CUDA_INSTALL_PATH=$CUDA_INSTALL_PATH invalid (directory does not exist)";
	return;
fi
# Next, an error check: gpgpu-sim only supports Linux and Mac OS; on any other OS the script returns here, which means GPGPUSIM_SETUP_ENVIRONMENT_WAS_RUN is never set to 1;
if [ ! `uname` = "Linux" -a  ! `uname` = "Darwin" ]; then
	echo "ERROR ** Unsupported platform: GPGPU-Sim $GPGPUSIM_VERSION_STRING developed and tested on Linux."
	return;
fi
# Next, strip the cuda and gpgpu-sim entries from PATH, so that sourcing this script repeatedly does not make PATH grow longer each time;
export PATH=`echo $PATH | sed "s#$GPGPUSIM_ROOT/bin:$CUDA_INSTALL_PATH/bin:##"`
# Next, prepend the cuda and gpgpu-sim directories to PATH;
export PATH=$GPGPUSIM_ROOT/bin:$CUDA_INSTALL_PATH/bin:$PATH


# to run the debug build of GPGPU-Sim run:
# source setup_environment debug
# Next, set NVCC_PATH to the full path of nvcc, which is commonly /usr/local/cuda/bin/nvcc
NVCC_PATH=`which nvcc`;
# Next, an error check: if nvcc cannot be found in the directories listed in PATH, which returns a non-zero status; use echo $? to see the exact value;
if [ $? = 1 ]; then
	echo "";
	echo "ERROR ** nvcc (from CUDA Toolkit) was not found in PATH but required to build GPGPU-Sim.";
	echo "         Try adding $CUDA_INSTALL_PATH/bin/ to your PATH environment variable.";
	echo "         Please also be sure to read the README file if you have not done so.";
	echo "";
	return;
fi
# Next, extract the current gcc version number from the first line of the gcc --version output
CC_VERSION=`gcc --version | head -1 | awk '{for(i=1;i<=NF;i++){ if(match($i,/^[0-9]\.[0-9]\.[0-9]$/))  {print $i; exit 0}}}'`

# Next, in the same way, get the nvcc version from the nvcc --version output; in this docker image the two values are 4.0 and 4000
CUDA_VERSION_STRING=`$CUDA_INSTALL_PATH/bin/nvcc --version | awk '/release/ {print $5;}' | sed 's/,//'`;
CUDA_VERSION_NUMBER=`echo $CUDA_VERSION_STRING | sed 's/\./ /' | awk '{printf("%02u%02u", 10*int($1), 10*$2);}'`
# Next, the version check: the number must lie between 2030 and 4020, i.e. CUDA 2.3 through 4.2
if [ $CUDA_VERSION_NUMBER -gt 4020 -o $CUDA_VERSION_NUMBER -lt 2030  ]; then
	echo "ERROR ** GPGPU-Sim version $GPGPUSIM_VERSION_STRING not tested with CUDA version $CUDA_VERSION_STRING (please see README)";
	return;
fi
# Next, no argument was passed in this run, so $# is 0 and GPGPUSIM_CONFIG=gcc-4.4.7/cuda-4000/release; when given, $1 is debug or release
if [ $# = '1' ] ;
then
    export GPGPUSIM_CONFIG=gcc-$CC_VERSION/cuda-$CUDA_VERSION_NUMBER/$1
else
    export GPGPUSIM_CONFIG=gcc-$CC_VERSION/cuda-$CUDA_VERSION_NUMBER/release
fi

# Next, this variable does not appear to be used
export QTINC=/usr/include
# Next, try to locate libOpenCL.so and cl.h and store their directories in NVOPENCL_LIBDIR and NVOPENCL_INCDIR;
# change NVOPENCL_LIBDIR to point to your opencl library directory, usually
# /usr/lib or /usr/lib64. Not setting this variable will cause gpgpu-sim to
# build without opencl support.
if [ -f /usr/lib64/libOpenCL.so ]; then
	export NVOPENCL_LIBDIR=/usr/lib64;

	# change NVOPENCL_INCDIR to point to your opencl include directory.
	if [ -f /usr/include/CL/cl.h ]; then
		export NVOPENCL_INCDIR=/usr/include/;
	elif [ -f $CUDA_INSTALL_PATH/include/CL/cl.h ]; then
		export NVOPENCL_INCDIR=$CUDA_INSTALL_PATH/include/;
	fi
fi
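# Next, set LD_LIBRARY_PATH, which changes which shared libraries get loaded for the whole session. To cope with repeated runs the script first tries to delete its old entry, but that attempt is ineffective: the sed pattern expects lib/<digits>/(debug|release), while the real entry looks like lib/gcc-4.4.7/cuda-4000/release, so nothing is removed and the variable grows on every run;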
# setting LD_LIBRARY_PATH as follows enables GPGPU-Sim to be invoked by 
# native CUDA and OpenCL applications. GPGPU-Sim is dynamically linked
# against instead of the CUDA toolkit.  This replaces this cumbersome
# static link setup in prior GPGPU-Sim releases.
if [ `uname` = "Darwin" ]; then
	export DYLD_LIBRARY_PATH=`echo $DYLD_LIBRARY_PATH | sed -Ee 's#'$GPGPUSIM_ROOT'\/lib\/[0-9]+\/(debug|release):##'`
	export DYLD_LIBRARY_PATH=$GPGPUSIM_ROOT/lib/$GPGPUSIM_CONFIG:$DYLD_LIBRARY_PATH
else
	export LD_LIBRARY_PATH=`echo $LD_LIBRARY_PATH | sed -re 's#'$GPGPUSIM_ROOT'\/lib\/[0-9]+\/(debug|release):##'`
	export LD_LIBRARY_PATH=$GPGPUSIM_ROOT/lib/$GPGPUSIM_CONFIG:$LD_LIBRARY_PATH
fi

# Next, OpenCL, which we ignore for now: this part sets up remote use of an NVIDIA OpenCL environment
# The following sets OPENCL_REMOTE_GPU_HOST which is used by GPGPU-Sim to
# SSH to remote node to generate PTX for OpenCL kernels when running on 
# a node that does not have an NVIDIA driver installed.
# The remote node should have GPGPU-Sim installed at the same path
if [ `uname` = "Darwin" ]; then
	HOSTNAME_PREFIX=`hostname -s`;
	export HOSTNAME_DOMAIN=`hostname | sed s/$HOSTNAME_PREFIX\.//`;
else
	HOSTNAME_DOMAIN=`hostname -d`
fi
if [ "x$HOSTNAME_DOMAIN" = "xece.ubc.ca" -a "$OPENCL_REMOTE_GPU_HOST" = "" ]; then
	export OPENCL_REMOTE_GPU_HOST=aamodt-pc05.ece.ubc.ca
fi
HOSTNAME_F=`hostname -f`
if [ "x$HOSTNAME_F" = "x$OPENCL_REMOTE_GPU_HOST" ]; then
	unset OPENCL_REMOTE_GPU_HOST
fi
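# Next, if the src/gpuwattch directory exists and contains gpgpu_sim.verify, point GPGPUSIM_POWER_MODEL at it; the two elif branches that follow handle a user-supplied GPGPUSIM_POWER_MODEL and do error checking;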
# The following checks to see if the GPGPU-Sim power model is enabled.
# GPGPUSIM_POWER_MODEL points to the directory where gpgpusim_mcpat is located.
# If this is not set, it checks the default directory "$GPGPUSIM_ROOT/src/gpuwattch/".
if [ -d $GPGPUSIM_ROOT/src/gpuwattch/ ]; then
	if [ ! -f $GPGPUSIM_ROOT/src/gpuwattch/gpgpu_sim.verify ]; then
		echo "ERROR ** gpgpu_sim.verify not found in $GPGPUSIM_ROOT/src/gpuwattch";
		return;
	fi
	export GPGPUSIM_POWER_MODEL=$GPGPUSIM_ROOT/src/gpuwattch/;
	echo "configured with GPUWattch.";
elif [ -n "$GPGPUSIM_POWER_MODEL" ]; then
	if [ ! -f $GPGPUSIM_POWER_MODEL/gpgpu_sim.verify ]; then
		echo "";
		echo "ERROR ** gpgpu_sim.verify not found in $GPGPUSIM_ROOT/src/gpuwattch/ - Either incorrect directory or incorrect McPAT version";
		return;
	fi
	echo "configure with power model in $GPGPUSIM_POWER_MODEL.";
elif [ ! -d $GPGPUSIM_POWER_MODEL ]; then
		echo "";
		echo "ERROR ** GPGPUSIM_POWER_MODEL ($GPGPUSIM_POWER_MODEL) does not exist... Please set this to the gpgpusim_mcpat directory or unset this environment variable.";
		return;
else
	echo "configured without a power model.";
fi

echo "setup_environment succeeded";
# Next, set the variable to 1 to tell the Makefile that the setup_environment script ran successfully
export GPGPUSIM_SETUP_ENVIRONMENT_WAS_RUN=1
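
The LD_LIBRARY_PATH comment in the script points out that the cleanup pattern never matches the entry that was added. As a minimal sketch of one way to make the prepend idempotent, the hypothetical helper below (prepend_unique is not part of setup_environment) rebuilds a colon-separated list with the given directory first and any earlier copies of it dropped. It assumes a POSIX sh or bash; zsh does not word-split unquoted variables by default, so it would need adapting there.

```shell
#!/bin/sh
# prepend_unique DIR LIST
# Print LIST (colon-separated) with DIR placed first; earlier copies of DIR
# and empty entries are dropped. Hypothetical helper, not from gpgpu-sim.
# The body runs in a subshell, so the IFS and globbing changes do not leak.
prepend_unique() (
    dir=$1
    result=$dir
    set -f          # do not expand wildcard characters inside entries
    IFS=:
    for entry in $2; do
        if [ -n "$entry" ] && [ "$entry" != "$dir" ]; then
            result="$result:$entry"
        fi
    done
    printf '%s\n' "$result"
)

# Possible use in place of the sed + export pair (shown only as a comment):
#   export LD_LIBRARY_PATH=$(prepend_unique "$GPGPUSIM_ROOT/lib/$GPGPUSIM_CONFIG" "$LD_LIBRARY_PATH")
prepend_unique /opt/sim/lib /opt/sim/lib:/usr/lib:/opt/sim/lib
```

Calling it any number of times with the same directory leaves exactly one copy at the front, which is the behavior the original sed line was presumably aiming for.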

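2. Overview

What does the sample program RAY call in gpgpu-sim at run time?

In Note 01 we ran the sample program RAY inside the container; now let us look at its dependencies from inside the container: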

# ldd /root/ispass2009-benchmarks/bin/release/RAY

The only dependency related to gpgpu-sim is libcudart.so.4, which is located at:

/root/gpgpu-sim_distribution/lib/gcc-4.4.7/cuda-4000/release/libcudart.so.4
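
To make this check repeatable, here is a small sketch that pulls the resolved libcudart path out of ldd-style output and tests whether it lies under the gpgpu-sim tree. Both function names (cudart_path, uses_gpgpusim) are hypothetical helpers, and the sample input is a shortened stand-in for real ldd output.

```shell
#!/bin/sh
# cudart_path: read ldd-style output on stdin and print the path that the
# first libcudart.so* entry resolves to (the field after "=>").
cudart_path() {
    awk '$1 ~ /^libcudart\.so/ && $2 == "=>" { print $3; exit }'
}

# uses_gpgpusim PATH ROOT: succeed if PATH lies under ROOT/lib/.
uses_gpgpusim() {
    case "$1" in
        "$2"/lib/*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Stand-in for: ldd /root/ispass2009-benchmarks/bin/release/RAY
sample='	libcudart.so.4 => /root/gpgpu-sim_distribution/lib/gcc-4.4.7/cuda-4000/release/libcudart.so.4 (0x0)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x0)'

resolved=$(printf '%s\n' "$sample" | cudart_path)
if uses_gpgpusim "$resolved" /root/gpgpu-sim_distribution; then
    echo "simulated runtime: $resolved"
else
    echo "not the gpgpu-sim runtime: $resolved"
fi
```

In the container the real command would be piped in instead of the sample text, for example ldd on the RAY binary followed by cudart_path.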

3. The gpgpu-sim Makefile

3.1 The annotated top-level Makefile

/root/gpgpu-sim_distribution/Makefile

It is very short, a little over 200 lines in total.

# Copyright (c) 2009-2011, Tor M. Aamodt, Ali Bakhoda, Timothy Rogers, 
# Jimmy Kwa, and The University of British Columbia
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
# Redistributions in binary form must reproduce the above copyright notice, this
# list of conditions and the following disclaimer in the documentation and/or
# other materials provided with the distribution.
# Neither the name of The University of British Columbia nor the names of its
# contributors may be used to endorse or promote products derived from this
# software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


# comment out next line to disable OpenGL support
# export OPENGL_SUPPORT=1
# Next, assign intersim2 here if the variable has not been given a value earlier
# (Temp) Using intersim2 by default, to use intersim, type make INTERSIM=intersim
INTERSIM ?= intersim2
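# Next, include version_detection.mk, which defines 6 variables: GPGPUSIM_VERSION, GPGPUSIM_BUILD, CUDA_VERSION_STRING, CUDART_VERSION, CC_VERSION, and GNUC_CPP0X (see 3.2)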
include version_detection.mk
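# Next, decide whether this is a debug or a release build of gpgpu-sim and export DEBUG accordingly; the sub-Makefiles presumably use it to choose between -g and -O3 in CXXFLAGS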
ifeq ($(GPGPUSIM_CONFIG), gcc-$(CC_VERSION)/cuda-$(CUDART_VERSION)/debug)
	export DEBUG=1
else
	export DEBUG=0
endif
# Next, default BUILD_ROOT to the current working directory, which is the Makefile's directory when make is run from the source root
BUILD_ROOT?=$(shell pwd)
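# Next, export TRACE; when the sub-modules are compiled this leads to a macro being defined in something like CXXFLAGS, acting as a switch in the C++ code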
export TRACE?=1
# Next, NVCC_PATH is set yet again
NVCC_PATH=$(shell which nvcc)
# Next, set two directory variables: SIM_LIB_DIR for the resulting libcudart.so and SIM_OBJ_FILES_DIR for the intermediate files
ifneq ($(shell which nvcc), "")
	ifeq ($(DEBUG), 1)
		export SIM_LIB_DIR=lib/gcc-$(CC_VERSION)/cuda-$(CUDART_VERSION)/debug
		export SIM_OBJ_FILES_DIR=$(BUILD_ROOT)/build/gcc-$(CC_VERSION)/cuda-$(CUDART_VERSION)/debug
	else
		export SIM_LIB_DIR=lib/gcc-$(CC_VERSION)/cuda-$(CUDART_VERSION)/release
		export SIM_OBJ_FILES_DIR=$(BUILD_ROOT)/build/gcc-$(CC_VERSION)/cuda-$(CUDART_VERSION)/release
	endif
endif
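# Next, define LIBS, which lists 4 library targets
# cuda-sim: the PTX-related part, possibly the parser for CUDA/PTX source?
# gpgpu-sim_uarch: maps to the directory that builds the core library of the gpgpu-sim simulator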
LIBS = cuda-sim gpgpu-sim_uarch $(INTERSIM) gpgpusimlib 

# Next, TARGETS holds the prerequisites of this top-level Makefile's final goal, so the build system builds them first;
# TARGETS contains libcudart.so, libOpenCL.so (or no_opencl_support), and cuobjdump_to_ptxplus
TARGETS =
ifeq ($(shell uname),Linux)
	TARGETS += $(SIM_LIB_DIR)/libcudart.so
else # MAC
	TARGETS += $(SIM_LIB_DIR)/libcudart.dylib
endif
# Next, append further prerequisites to TARGETS
ifeq  ($(NVOPENCL_LIBDIR),)
	TARGETS += no_opencl_support
else ifeq ($(NVOPENCL_INCDIR),)
	TARGETS += no_opencl_support
else
	TARGETS += $(SIM_LIB_DIR)/libOpenCL.so
endif
	TARGETS += cuobjdump_to_ptxplus/cuobjdump_to_ptxplus

# Next, this is really the intermediate build directory for gpuwattch; the mcpat application is built there: /root/gpgpu-sim_distribution/build/gcc-4.4.7/cuda-4000/release/gpuwattch/mcpat
MCPAT=
MCPAT_OBJ_DIR=
MCPAT_DBG_FLAG=
ifneq ($(GPGPUSIM_POWER_MODEL),)
	LIBS += mcpat


	ifeq ($(DEBUG), 1)
		MCPAT_DBG_FLAG = dbg
	endif

	MCPAT_OBJ_DIR = $(SIM_OBJ_FILES_DIR)/gpuwattch

	MCPAT = $(MCPAT_OBJ_DIR)/*.o
endif

# Next, gpgpusim is the default goal of this Makefile (it is the first target); it has 4 prerequisites: check_setup_environment check_power makedirs $(TARGETS)
.PHONY: check_setup_environment check_power
gpgpusim: check_setup_environment check_power makedirs $(TARGETS)

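# Next, this recipe ends up with NVCC_PATH=/usr/local/cuda/bin/nvcc
# It checks that three environment variables are non-empty; if any one of them is empty the build aborts; if none is empty, the earlier source setup_environment succeeded;
# It then sets NVCC_PATH to the absolute path of the nvcc compiler (already set earlier, in fact); on success it prints: Building GPGPU-Sim version 3.2.2 (build ) with CUDA version 4.0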
check_setup_environment:
	 @if [ ! -n "$(GPGPUSIM_ROOT)" -o ! -n "$(CUDA_INSTALL_PATH)" -o ! -n "$(GPGPUSIM_SETUP_ENVIRONMENT_WAS_RUN)" ]; then \
		echo "ERROR *** run 'source setup_environment' before 'make'; please see README."; \
		exit 101; \
	 else \
		NVCC_PATH=`which nvcc`; \
		if [ $$? = 1 ]; then \
			echo ""; \
			echo "ERROR ** nvcc (from CUDA Toolkit) was not found in PATH but required to build GPGPU-Sim."; \
			echo "         Try adding $(CUDA_INSTALL_PATH)/bin/ to your PATH environment variable."; \
			echo "         Please also be sure to read the README file if you have not done so."; \
			echo ""; \
			exit 102; \
		else \
			echo; echo "	Building GPGPU-Sim version $(GPGPUSIM_VERSION) (build $(GPGPUSIM_BUILD)) with CUDA version $(CUDA_VERSION_STRING)"; echo; \
	 		true; \
		fi \
	 fi 
# Next, check whether GPGPUSIM_POWER_MODEL, the gpuwattch-related variable, was set correctly;
check_power:
	@if [ -d "$(GPGPUSIM_ROOT)/src/gpuwattch/" -a ! -n "$(GPGPUSIM_POWER_MODEL)" ]; then \
		echo ""; \
		echo "	Power model detected in default directory ($(GPGPUSIM_ROOT)/src/gpuwattch) but GPGPUSIM_POWER_MODEL not set."; \
		echo "	Please re-run setup_environment or manually set GPGPUSIM_POWER_MODEL to the gpuwattch directory if you would like to include the GPGPU-Sim Power Model."; \
		echo ""; \
		true; \
	elif [ ! -d "$(GPGPUSIM_POWER_MODEL)" ]; then \
		echo ""; \
		echo "ERROR ** Power model directory invalid."; \
		echo "($(GPGPUSIM_POWER_MODEL)) is not a valid directory."; \
		echo "Please set GPGPUSIM_POWER_MODEL to the GPGPU-Sim gpuwattch directory."; \
		echo ""; \
		exit 101; \
	elif [ -n "$(GPGPUSIM_POWER_MODEL)" -a ! -f "$(GPGPUSIM_POWER_MODEL)/gpgpu_sim.verify" ]; then \
		echo ""; \
		echo "ERROR ** Power model directory invalid."; \
		echo "gpgpu_sim.verify not found in $(GPGPUSIM_POWER_MODEL)."; \
		echo "Please ensure that GPGPUSIM_POWER_MODEL points to a valid gpuwattch directory and that you have the correct GPGPU-Sim mcpat distribution."; \
		echo ""; \
		exit 102; \
	fi
# Next, print a message saying that OpenCL is not supported;
no_opencl_support:
	@echo "Warning: gpgpu-sim is building without opencl support. Make sure NVOPENCL_LIBDIR and NVOPENCL_INCDIR are set"
# Next, the build rule for the target libcudart.so
$(SIM_LIB_DIR)/libcudart.so: makedirs $(LIBS) cudalib
	g++ -shared -Wl,-soname,libcudart.so \
			$(SIM_OBJ_FILES_DIR)/libcuda/*.o \
			$(SIM_OBJ_FILES_DIR)/cuda-sim/*.o \
			$(SIM_OBJ_FILES_DIR)/cuda-sim/decuda_pred_table/*.o \
			$(SIM_OBJ_FILES_DIR)/gpgpu-sim/*.o \
			$(SIM_OBJ_FILES_DIR)/$(INTERSIM)/*.o \
			$(SIM_OBJ_FILES_DIR)/*.o -lm -lz -lGL -pthread \
			$(MCPAT) \
			-o $(SIM_LIB_DIR)/libcudart.so
	if [ ! -f $(SIM_LIB_DIR)/libcudart.so.2 ]; then ln -s libcudart.so $(SIM_LIB_DIR)/libcudart.so.2; fi
	if [ ! -f $(SIM_LIB_DIR)/libcudart.so.3 ]; then ln -s libcudart.so $(SIM_LIB_DIR)/libcudart.so.3; fi
	if [ ! -f $(SIM_LIB_DIR)/libcudart.so.4 ]; then ln -s libcudart.so $(SIM_LIB_DIR)/libcudart.so.4; fi
# Next, the build rule for the target libcudart.dylib, used on Mac OS
$(SIM_LIB_DIR)/libcudart.dylib: makedirs $(LIBS) cudalib
	g++ -dynamiclib -Wl,-headerpad_max_install_names,-undefined,dynamic_lookup,-compatibility_version,1.1,-current_version,1.1\
			$(SIM_OBJ_FILES_DIR)/libcuda/*.o \
			$(SIM_OBJ_FILES_DIR)/cuda-sim/*.o \
			$(SIM_OBJ_FILES_DIR)/cuda-sim/decuda_pred_table/*.o \
			$(SIM_OBJ_FILES_DIR)/gpgpu-sim/*.o \
			$(SIM_OBJ_FILES_DIR)/$(INTERSIM)/*.o  \
			$(SIM_OBJ_FILES_DIR)/*.o -lm -lz -pthread \
			$(MCPAT) \
			-o $(SIM_LIB_DIR)/libcudart.dylib
# Next, the build rule for the target libOpenCL.so
$(SIM_LIB_DIR)/libOpenCL.so: makedirs $(LIBS) opencllib
	g++ -shared -Wl,-soname,libOpenCL.so \
			$(SIM_OBJ_FILES_DIR)/libopencl/*.o \
			$(SIM_OBJ_FILES_DIR)/cuda-sim/*.o \
			$(SIM_OBJ_FILES_DIR)/cuda-sim/decuda_pred_table/*.o \
			$(SIM_OBJ_FILES_DIR)/gpgpu-sim/*.o \
			$(SIM_OBJ_FILES_DIR)/$(INTERSIM)/*.o \
			$(SIM_OBJ_FILES_DIR)/*.o -lm -lz -lGL -pthread \
			$(MCPAT) \
			-o $(SIM_LIB_DIR)/libOpenCL.so 
	if [ ! -f $(SIM_LIB_DIR)/libOpenCL.so.1 ]; then ln -s libOpenCL.so $(SIM_LIB_DIR)/libOpenCL.so.1; fi
	if [ ! -f $(SIM_LIB_DIR)/libOpenCL.so.1.1 ]; then ln -s libOpenCL.so $(SIM_LIB_DIR)/libOpenCL.so.1.1; fi
# Next, the build rule for the target cudalib
cudalib: makedirs cuda-sim
	$(MAKE) -C ./libcuda/ depend
	$(MAKE) -C ./libcuda/
# Next, the build rule for the target mcpat
ifneq ($(GPGPUSIM_POWER_MODEL),)
mcpat: makedirs
	$(MAKE) -C $(GPGPUSIM_POWER_MODEL) depend
	$(MAKE) -C $(GPGPUSIM_POWER_MODEL) $(MCPAT_DBG_FLAG)
endif
# Next, build the cuda-sim library
cuda-sim: makedirs
	$(MAKE) -C ./src/cuda-sim/ depend
	$(MAKE) -C ./src/cuda-sim/
# Next, build the gpgpu-sim core library, and its prerequisites
gpgpu-sim_uarch: makedirs cuda-sim
	$(MAKE) -C ./src/gpgpu-sim/ depend
	$(MAKE) -C ./src/gpgpu-sim/
# Next, build the interconnect simulator $(INTERSIM) (intersim2 by default), and its prerequisites
$(INTERSIM): makedirs cuda-sim gpgpu-sim_uarch
	$(MAKE) "CREATE_LIBRARY=1" "DEBUG=$(DEBUG)" -C ./src/$(INTERSIM)
# Next, build the gpgpu-sim core library together with its related libraries, and their prerequisites
gpgpusimlib: makedirs cuda-sim gpgpu-sim_uarch $(INTERSIM)
	$(MAKE) -C ./src/ depend
	$(MAKE) -C ./src/
# Next, build the OpenCL-related library, and its prerequisites
opencllib: makedirs cuda-sim
	$(MAKE) -C ./libopencl/ depend
	$(MAKE) -C ./libopencl/
# Next, build cuobjdump_to_ptxplus, and its prerequisites
.PHONY: cuobjdump_to_ptxplus/cuobjdump_to_ptxplus
cuobjdump_to_ptxplus/cuobjdump_to_ptxplus: makedirs
	$(MAKE) -C ./cuobjdump_to_ptxplus/ depend
	$(MAKE) -C ./cuobjdump_to_ptxplus/
# Next, create the directories that are needed
makedirs:
	if [ ! -d $(SIM_LIB_DIR) ]; then mkdir -p $(SIM_LIB_DIR); fi;
	if [ ! -d $(SIM_OBJ_FILES_DIR)/libcuda ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/libcuda; fi;
	if [ ! -d $(SIM_OBJ_FILES_DIR)/cuda-sim ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/cuda-sim; fi;
	if [ ! -d $(SIM_OBJ_FILES_DIR)/cuda-sim/decuda_pred_table ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/cuda-sim/decuda_pred_table; fi;
	if [ ! -d $(SIM_OBJ_FILES_DIR)/gpgpu-sim ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/gpgpu-sim; fi;
	if [ ! -d $(SIM_OBJ_FILES_DIR)/libopencl ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/libopencl; fi;
	if [ ! -d $(SIM_OBJ_FILES_DIR)/libopencl/bin ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/libopencl/bin; fi;
	if [ ! -d $(SIM_OBJ_FILES_DIR)/$(INTERSIM) ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/$(INTERSIM); fi;
	if [ ! -d $(SIM_OBJ_FILES_DIR)/cuobjdump_to_ptxplus ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/cuobjdump_to_ptxplus; fi;
	if [ ! -d $(SIM_OBJ_FILES_DIR)/gpuwattch ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/gpuwattch; fi;
	if [ ! -d $(SIM_OBJ_FILES_DIR)/gpuwattch/cacti ]; then mkdir -p $(SIM_OBJ_FILES_DIR)/gpuwattch/cacti; fi;
# Next, the all target, which simply invokes make gpgpusim
all:
	$(MAKE) gpgpusim
# Next, build the documentation
docs:
	$(MAKE) -C doc/doxygen/
# Next, remove the documentation
cleandocs:
	$(MAKE) clean -C doc/doxygen/
# Next, remove the built targets and the generated documentation
clean: makedirs
	$(MAKE) cleangpgpusim
# Next, remove the built targets and the intermediate files
cleangpgpusim: cleandocs
	rm -rf $(SIM_LIB_DIR)
	rm -rf $(SIM_OBJ_FILES_DIR)
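
The DEBUG decision near the top of this Makefile is an exact string comparison between GPGPUSIM_CONFIG (exported by setup_environment) and a string the Makefile rebuilds from its own version detection. The sketch below mirrors that comparison in shell so it can be tried outside make; build_kind is a hypothetical helper, not something gpgpu-sim provides.

```shell
#!/bin/sh
# build_kind CONFIG CC_VERSION CUDART_VERSION
# Mirrors the Makefile's ifeq: only an exact match of
# gcc-<CC_VERSION>/cuda-<CUDART_VERSION>/debug selects a debug build;
# every other value falls through to release.
build_kind() {
    if [ "$1" = "gcc-$2/cuda-$3/debug" ]; then
        echo debug
    else
        echo release
    fi
}

build_kind "gcc-4.4.7/cuda-4000/debug"   4.4.7 4000
build_kind "gcc-4.4.7/cuda-4000/release" 4.4.7 4000
```

One consequence of this design, assuming the comparison really is the only input to DEBUG: if the versions detected by the shell script and by make ever disagree, a requested debug build would quietly become a release build.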

3.2 The annotated version_detection.mk

# Copyright (c) 2009-2011, Tor M. Aamodt
# Wilson W.L. Fung, Ali Bakhoda
# The University of British Columbia
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
# Redistributions in binary form must reproduce the above copyright notice, this
# list of conditions and the following disclaimer in the documentation and/or
# other materials provided with the distribution.
# Neither the name of The University of British Columbia nor the names of its
# contributors may be used to endorse or promote products derived from this
# software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

# Next, if GPGPUSIM_ROOT is not empty, read the value of GPGPUSIM_VERSION from the version file;
# Detect GPGPU-Sim Version
ifeq ($(GPGPUSIM_ROOT),)
else
GPGPUSIM_VERSION=$(shell cat $(GPGPUSIM_ROOT)/version | awk '/Version/ {print $$8}' )
GPGPUSIM_BUILD=$(shell cat $(GPGPUSIM_ROOT)/version | awk '/Change/ {print $$6}' )
endif
# Next, get the CUDA version by asking the nvcc compiler
# Detect CUDA Runtime Version 
CUDA_VERSION_STRING:=$(shell $(CUDA_INSTALL_PATH)/bin/nvcc --version | awk '/release/ {print $$5;}' | sed 's/,//')
CUDART_VERSION:=$(shell echo $(CUDA_VERSION_STRING) | sed 's/\./ /' | awk '{printf("%02u%02u", 10*int($$1), 10*$$2);}')
# Next, the gcc version, i.e. the compiler version
# Detect GCC Version 
CC_VERSION := $(shell gcc --version | head -1 | awk '{for(i=1;i<=NF;i++){ if(match($$i,/^[0-9]\.[0-9]\.[0-9]$$/))  {print $$i; exit 0 }}}')
# Next, this affects something like CFLAGS: it tells the build which standard the compiler supports, for example C++03 versus C++11 (C++0x);
# Detect Support for C++11 (C++0x) from GCC Version 
GNUC_CPP0X := $(shell gcc --version | perl -ne 'if (/gcc\s+\(.*\)\s+([0-9.]+)/){ if($$1 >= 4.3) {$$n=1} else {$$n=0;} } END { print $$n; }')
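
The two version pipelines above are easy to try in isolation. The sketch below wraps the same sed/awk commands in two hypothetical shell functions (cuda_version_number and gcc_version_from_line are names made up here) and feeds them literal strings instead of real compiler output.

```shell
#!/bin/sh
# cuda_version_number "4.0" prints 4000: major and minor are each
# multiplied by 10 and printed as two digits, as in CUDART_VERSION.
cuda_version_number() {
    echo "$1" | sed 's/\./ /' | awk '{printf("%02u%02u", 10*int($1), 10*$2);}'
}

# gcc_version_from_line LINE: print the first field of the form d.d.d,
# using the same pattern as CC_VERSION. Each component must be a single
# digit, so a version such as 11.4.0 yields no output.
gcc_version_from_line() {
    echo "$1" | awk '{for(i=1;i<=NF;i++){ if(match($i,/^[0-9]\.[0-9]\.[0-9]$/)) {print $i; exit 0}}}'
}

cuda_version_number 4.0; echo
gcc_version_from_line "gcc (GCC) 4.4.7"
```

With the values from this container the first call gives 4000 and the second gives 4.4.7, matching the gcc-4.4.7/cuda-4000 directory names seen elsewhere in this note.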

3.3 Building and running vectorAdd

root@9c2982bd45f9:~/test_vectorAdd# cd ../NVIDIA_GPU_Computing_SDK/C/src/vectorAdd/
root@9c2982bd45f9:~/NVIDIA_GPU_Computing_SDK/C/src/vectorAdd# make
root@9c2982bd45f9:~/NVIDIA_GPU_Computing_SDK/C/src/vectorAdd# cd -
/root/test_vectorAdd
root@9c2982bd45f9:~/test_vectorAdd# cp ../gpgpu-sim_distribution/configs/GTX480/
config_fermi_islip.icnt  gpgpusim.config          gpuwattch_gtx480.xml
root@9c2982bd45f9:~/test_vectorAdd# cp ../gpgpu-sim_distribution/configs/GTX480/* ./
root@9c2982bd45f9:~/test_vectorAdd# cd ../NVIDIA_GPU_Computing_SDK/C/src/vectorAdd/
root@9c2982bd45f9:~/NVIDIA_GPU_Computing_SDK/C/src/vectorAdd# vim Makefile
root@9c2982bd45f9:~/NVIDIA_GPU_Computing_SDK/C/src/vectorAdd# vim ../../common/common.mk
root@9c2982bd45f9:~/NVIDIA_GPU_Computing_SDK/C/src/vectorAdd# ldd ../../bin/linux/release/vectorAdd
        linux-vdso.so.1 =>  (0x00007fff04be1000)
        libcudart.so.4 => /root/gpgpu-sim_distribution/lib/gcc-4.4.7/cuda-4000/release/libcudart.so.4 (0x00007f8745a00000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f87456fc000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f87453f6000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f87451e0000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8744e17000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f8744bfe000)
        libGL.so.1 => /usr/lib/x86_64-linux-gnu/mesa/libGL.so.1 (0x00007f8744998000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f874477a000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f87460c0000)
        libglapi.so.0 => /usr/lib/x86_64-linux-gnu/libglapi.so.0 (0x00007f8744553000)
        libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007f8744341000)
        libXdamage.so.1 => /usr/lib/x86_64-linux-gnu/libXdamage.so.1 (0x00007f874413e000)
        libXfixes.so.3 => /usr/lib/x86_64-linux-gnu/libXfixes.so.3 (0x00007f8743f38000)
        libX11-xcb.so.1 => /usr/lib/x86_64-linux-gnu/libX11-xcb.so.1 (0x00007f8743d36000)
        libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007f8743a01000)
        libxcb-glx.so.0 => /usr/lib/x86_64-linux-gnu/libxcb-glx.so.0 (0x00007f87437ea000)
        libxcb-dri2.so.0 => /usr/lib/x86_64-linux-gnu/libxcb-dri2.so.0 (0x00007f87435e5000)
        libxcb-dri3.so.0 => /usr/lib/x86_64-linux-gnu/libxcb-dri3.so.0 (0x00007f87433e2000)
        libxcb-present.so.0 => /usr/lib/x86_64-linux-gnu/libxcb-present.so.0 (0x00007f87431df000)
        libxcb-sync.so.1 => /usr/lib/x86_64-linux-gnu/libxcb-sync.so.1 (0x00007f8742fd9000)
        libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007f8742dba000)
        libxshmfence.so.1 => /usr/lib/x86_64-linux-gnu/libxshmfence.so.1 (0x00007f8742bb8000)
        libXxf86vm.so.1 => /usr/lib/x86_64-linux-gnu/libXxf86vm.so.1 (0x00007f87429b2000)
        libdrm.so.2 => /usr/lib/x86_64-linux-gnu/libdrm.so.2 (0x00007f87427a4000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f87425a0000)
        libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007f874239c000)
        libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007f8742196000)
root@9c2982bd45f9:~/NVIDIA_GPU_Computing_SDK/C/src/vectorAdd# cd ~/test_vectorAdd/
root@9c2982bd45f9:~/test_vectorAdd# ../NVIDIA_GPU_Computing_SDK/C/bin/linux/release/vectorAdd


        *** GPGPU-Sim Simulator Version 3.2.2  [build 0] ***


GPGPU-Sim PTX: simulation mode 0 (can change with PTX_SIM_MODE_FUNC environment variable:
               1=functional simulation only, 0=detailed performance simulator)
GPGPU-Sim: Configuration options:

-network_mode                           1 # Interconnection network mode
-inter_config_file   config_fermi_islip.icnt # Interconnection network config file
-gpgpu_ptx_use_cuobjdump                    1 # Use cuobjdump to extract ptx and sass from binaries
-gpgpu_experimental_lib_support                    0 # Try to extract code from cuda libraries [Broken because of unknown cudaGetExportTable]
-gpgpu_ptx_convert_to_ptxplus                    0 # Convert SASS (native ISA) to ptxplus and run ptxplus
-gpgpu_ptx_force_max_capability                   20 # Force maximum compute capability
-gpgpu_ptx_inst_debug_to_file                    0 # Dump executed instructions' debug information to file
-gpgpu_ptx_inst_debug_file       inst_debug.txt # Executed instructions' debug output file
-gpgpu_ptx_inst_debug_thread_uid                    1 # Thread UID for executed instructions' debug output
-gpgpu_simd_model                       1 # 1 = post-dominator
-gpgpu_shader_core_pipeline              1536:32 # shader core pipeline config, i.e., {<nthread>:<warpsize>}
-gpgpu_tex_cache:l1  4:128:24,L:R:m:N,F:128:4,128:2 # per-shader L1 texture cache  (READ-ONLY) config  {<nsets>:<bsize>:<assoc>,<rep>:<wr>:<alloc>:<wr_alloc>,<mshr>:<N>:<merge>,<mq>:<rf>}
-gpgpu_const_cache:l1 64:64:2,L:R:f:N,A:2:32,4 # per-shader L1 constant memory cache  (READ-ONLY) config  {<nsets>:<bsize>:<assoc>,<rep>:<wr>:<alloc>:<wr_alloc>,<mshr>:<N>:<merge>,<mq>}
-gpgpu_cache:il1     4:128:4,L:R:f:N,A:2:32,4 # shader L1 instruction cache config  {<nsets>:<bsize>:<assoc>,<rep>:<wr>:<alloc>:<wr_alloc>,<mshr>:<N>:<merge>,<mq>}

Use ldd on vectorAdd to inspect its dependencies:

root@9c2982bd45f9:~/test_vectorAdd# ldd ../NVIDIA_GPU_Computing_SDK/C/bin/linux/release/vectorAdd
        linux-vdso.so.1 =>  (0x00007fff8b9f7000)
        libcudart.so.4 => /root/gpgpu-sim_distribution/lib/gcc-4.4.7/cuda-4000/release/libcudart.so.4
...

We find that

vectorAdd depends on libcudart.so.4,

and libcudart.so.4 is a symbolic link to libcudart.so;
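
The links come from the ln -s lines in the libcudart.so rule of the top-level Makefile. As a sketch, the snippet below replays those lines in a throwaway temporary directory, with an empty file standing in for the real library, and then resolves one link with readlink.

```shell
#!/bin/sh
# Replay the Makefile's link step in a scratch directory.
tmpdir=$(mktemp -d)
: > "$tmpdir/libcudart.so"      # empty stand-in for the real shared library

for v in 2 3 4; do
    # Same guard as the Makefile: only create the link if it is missing.
    if [ ! -f "$tmpdir/libcudart.so.$v" ]; then
        ln -s libcudart.so "$tmpdir/libcudart.so.$v"
    fi
done

# readlink prints the link's target exactly as it was stored.
link_target=$(readlink "$tmpdir/libcudart.so.4")
echo "libcudart.so.4 -> $link_target"

rm -rf "$tmpdir"
```

Because the stored target is the relative name libcudart.so, the links keep working if the whole lib directory is moved. In the container, running readlink on the libcudart.so.4 path reported by ldd should show the same target.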

3.4 How libcudart.so is built

From the top-level Makefile, libcudart.so is linked as follows:

	g++ -shared -Wl,-soname,libcudart.so \
			$(SIM_OBJ_FILES_DIR)/libcuda/*.o \
			$(SIM_OBJ_FILES_DIR)/cuda-sim/*.o \
			$(SIM_OBJ_FILES_DIR)/cuda-sim/decuda_pred_table/*.o \
			$(SIM_OBJ_FILES_DIR)/gpgpu-sim/*.o \
			$(SIM_OBJ_FILES_DIR)/$(INTERSIM)/*.o \
			$(SIM_OBJ_FILES_DIR)/*.o -lm -lz -lGL -pthread \
			$(MCPAT) \
			-o $(SIM_LIB_DIR)/libcudart.so

It depends on the targets in LIBS = cuda-sim gpgpu-sim_uarch $(INTERSIM) gpgpusimlib, plus cudalib (and mcpat when the power model is enabled).

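3.5 How to analyze the code next

From 3.4, libcudart.so bundles the .o files from several directories, which means the gpgpu-sim code that simulates the GPU is included in it as well.

So the plan is to trace the logic of the CUDA runtime API to see how gpgpu-sim simulates GPU behavior:

Trace cudaMalloc to see how GPU device memory is allocated;

Trace cudaMemcpy to work out how GPU data transfers proceed;

Trace the vectorAdd kernel to follow how the GPU launches a kernel;

Trace the arithmetic inside vectorAdd to follow how the GPU schedules and runs warps;

And summarize how the power model works.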
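
One possible starting point for this tracing, sketched under several assumptions: gdb is available in the container, gpgpu-sim was rebuilt after source setup_environment debug so that symbols are present, and kernel launches in this CUDA 4.0 setup go through the runtime entry point cudaLaunch. The snippet only writes a gdb command file; the invocation itself is left as a comment.

```shell
#!/bin/sh
# Write a gdb command file with breakpoints on the runtime API entry points
# named in the plan above. cudaLaunch is an assumption about how kernel
# launches reach libcudart.so in this CUDA 4.0 setup.
cmdfile=$(mktemp)
cat > "$cmdfile" <<'EOF'
set breakpoint pending on
break cudaMalloc
break cudaMemcpy
break cudaLaunch
run
bt
EOF

echo "gdb commands written to $cmdfile"

# Hypothetical invocation from ~/test_vectorAdd inside the container:
#   gdb -batch -x "$cmdfile" ../NVIDIA_GPU_Computing_SDK/C/bin/linux/release/vectorAdd
```

The pending-breakpoint setting matters because the functions live in libcudart.so, which is not loaded yet when gdb reads the command file. The backtrace at the first stop should show which gpgpu-sim source file implements the call.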
