13.3 使用OpenMP并行化交叉编译Windows二进制文件
NOTE:此示例代码可以在 https://github.com/dev-cafe/cmake-cookbook/tree/v1.0/chapter-13/recipe-02 中找到,其中包含一个C++示例和Fortran示例。该示例在CMake 3.5版(或更高版本)中是有效的,并且已经在GNU/Linux、macOS和Windows上进行过测试。
在这个示例中,我们将交叉编译一个OpenMP并行化的Windows二进制文件。
准备工作
我们将使用第3章第5节中的未修改的源代码,示例代码将所有自然数加到N (example.cpp):
#include <iostream>#include <omp.h>#include <string>int main(int argc, char *argv[]) {std::cout << "number of available processors: " << omp_get_num_procs()<< std::endl;std::cout << "number of threads: " << omp_get_max_threads() << std::endl;auto n = std::stol(argv[1]);std::cout << "we will form sum of numbers from 1 to " << n << std::endl;// start timerauto t0 = omp_get_wtime();auto s = 0LL;#pragma omp parallel for reduction(+ : s)for (auto i = 1; i <= n; i++) {s += i;}// stop timerauto t1 = omp_get_wtime();std::cout << "sum: " << s << std::endl;std::cout << "elapsed wall clock time: " << t1 - t0 << " seconds" << std::endl;return 0;}
CMakeLists.txt检测OpenMP并行环境方面基本没有变化,除了有一个额外的安装目标:
# set minimum cmake versioncmake_minimum_required(VERSION 3.9 FATAL_ERROR)# project name and languageproject(recipe-02 LANGUAGES CXX)set(CMAKE_CXX_STANDARD 11)set(CMAKE_CXX_EXTENSIONS OFF)set(CMAKE_CXX_STANDARD_REQUIRED ON)include(GNUInstallDirs)set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY${CMAKE_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR})set(CMAKE_LIBRARY_OUTPUT_DIRECTORY${CMAKE_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR})set(CMAKE_RUNTIME_OUTPUT_DIRECTORY${CMAKE_BINARY_DIR}/${CMAKE_INSTALL_BINDIR})find_package(OpenMP REQUIRED)add_executable(example example.cpp)target_link_libraries(examplePUBLICOpenMP::OpenMP_CXX)install(TARGETSexampleDESTINATION${CMAKE_INSTALL_BINDIR})
具体实施
通过以下步骤,我们将设法交叉编译一个OpenMP并行化的Windows可执行文件:
创建一个包含
example.cpp和CMakeLists.txt的目录。我们将使用与之前例子相同的
toolchain.cmake:# the name of the target operating systemset(CMAKE_SYSTEM_NAME Windows)# which compilers to useset(CMAKE_CXX_COMPILER i686-w64-mingw32-g++)# adjust the default behaviour of the find commands:# search headers and libraries in the target environmentset(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)# search programs in the host environmentset(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
将
CMAKE_CXX_COMPILER设置为对应的编译器(路径)。然后,通过
CMAKE_TOOLCHAIN_FILE指向工具链文件来配置代码(本例中,使用了从源代码构建的MXE编译器):$ mkdir -p build$ cd build$ cmake -D CMAKE_TOOLCHAIN_FILE=toolchain.cmake ..-- The CXX compiler identification is GNU 5.4.0-- Check for working CXX compiler: /home/user/mxe/usr/bin/i686-w64-mingw32.static-g++-- Check for working CXX compiler: /home/user/mxe/usr/bin/i686-w64-mingw32.static-g++ -- works-- Detecting CXX compiler ABI info-- Detecting CXX compiler ABI info - done-- Detecting CXX compile features-- Detecting CXX compile features - done-- Found OpenMP_CXX: -fopenmp (found version "4.0")-- Found OpenMP: TRUE (found version "4.0")-- Configuring done-- Generating done-- Build files have been written to: /home/user/cmake-recipes/chapter-13/recipe-02/cxx-example/build
构建可执行文件:
$ cmake --build .Scanning dependencies of target example[ 50%] Building CXX object CMakeFiles/example.dir/example.cpp.obj[100%] Linking CXX executable bin/example.exe[100%] Built target example
将
example.exe拷贝到Windows环境下。Windows环境下,将看到如下的输出:
$ set OMP_NUM_THREADS=1$ example.exe 1000000000number of available processors: 2number of threads: 1we will form sum of numbers from 1 to 1000000000sum: 500000000500000000elapsed wall clock time: 2.641 seconds$ set OMP_NUM_THREADS=2$ example.exe 1000000000number of available processors: 2number of threads: 2we will form sum of numbers from 1 to 1000000000sum: 500000000500000000elapsed wall clock time: 1.328 seconds
正如我们所看到的,二进制文件可以在Windows上工作,而且由于OpenMP并行化,我们可以观察到加速效果!
工作原理
我们已经成功地使用一个简单的工具链进行交叉编译了一个可执行文件,并可以在Windows平台上并行执行。我们可以通过设置OMP_NUM_THREADS来指定OpenMP线程的数量。从一个线程到两个线程,我们观察到运行时从2.6秒减少到1.3秒。有关工具链文件的讨论,请参阅前面的示例。
更多信息
可以交叉编译一组目标平台(例如:Android),可以参考:https://cmake.org/cmake/help/latest/manual/cmake-toolchains.7.html
