2.5 Detecting the processor instruction set

NOTE: The code for this recipe is available at https://github.com/dev-cafe/cmake-cookbook/tree/v1.0/chapter-02/recipe-05 and includes a C++ example. The recipe is valid for CMake version 3.10 (and higher) and has been tested on GNU/Linux, macOS, and Windows.

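In this recipe, we will discuss how to detect the instruction set supported by the host processor with the help of CMake. This functionality was added to CMake relatively recently and requires CMake 3.10 or later. The detected host system information can be used to set corresponding compiler flags, to compile sources optionally, or to generate source code depending on the host system. Our goal in this recipe is to detect the host system information, pass it to the C++ source code using preprocessor definitions, and print the information to the output.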

Getting ready

Our C++ source file (processor-info.cpp) is shown below:

  #include "config.h"

  #include <cstdlib>
  #include <iostream>

  int main()
  {
    std::cout << "Number of logical cores: "
              << NUMBER_OF_LOGICAL_CORES << std::endl;
    std::cout << "Number of physical cores: "
              << NUMBER_OF_PHYSICAL_CORES << std::endl;
    std::cout << "Total virtual memory in megabytes: "
              << TOTAL_VIRTUAL_MEMORY << std::endl;
    std::cout << "Available virtual memory in megabytes: "
              << AVAILABLE_VIRTUAL_MEMORY << std::endl;
    std::cout << "Total physical memory in megabytes: "
              << TOTAL_PHYSICAL_MEMORY << std::endl;
    std::cout << "Available physical memory in megabytes: "
              << AVAILABLE_PHYSICAL_MEMORY << std::endl;
    std::cout << "Processor is 64Bit: "
              << IS_64BIT << std::endl;
    std::cout << "Processor has floating point unit: "
              << HAS_FPU << std::endl;
    std::cout << "Processor supports MMX instructions: "
              << HAS_MMX << std::endl;
    std::cout << "Processor supports Ext. MMX instructions: "
              << HAS_MMX_PLUS << std::endl;
    std::cout << "Processor supports SSE instructions: "
              << HAS_SSE << std::endl;
    std::cout << "Processor supports SSE2 instructions: "
              << HAS_SSE2 << std::endl;
    std::cout << "Processor supports SSE FP instructions: "
              << HAS_SSE_FP << std::endl;
    std::cout << "Processor supports SSE MMX instructions: "
              << HAS_SSE_MMX << std::endl;
    std::cout << "Processor supports 3DNow instructions: "
              << HAS_AMD_3DNOW << std::endl;
    std::cout << "Processor supports 3DNow+ instructions: "
              << HAS_AMD_3DNOW_PLUS << std::endl;
    std::cout << "IA64 processor emulating x86 : "
              << HAS_IA64 << std::endl;
    std::cout << "OS name: "
              << OS_NAME << std::endl;
    std::cout << "OS sub-type: "
              << OS_RELEASE << std::endl;
    std::cout << "OS build ID: "
              << OS_VERSION << std::endl;
    std::cout << "OS platform: "
              << OS_PLATFORM << std::endl;

    return EXIT_SUCCESS;
  }

This file includes the config.h header, which we will generate from config.h.in. The contents of config.h.in are as follows:

  #pragma once

  #define NUMBER_OF_LOGICAL_CORES @_NUMBER_OF_LOGICAL_CORES@
  #define NUMBER_OF_PHYSICAL_CORES @_NUMBER_OF_PHYSICAL_CORES@
  #define TOTAL_VIRTUAL_MEMORY @_TOTAL_VIRTUAL_MEMORY@
  #define AVAILABLE_VIRTUAL_MEMORY @_AVAILABLE_VIRTUAL_MEMORY@
  #define TOTAL_PHYSICAL_MEMORY @_TOTAL_PHYSICAL_MEMORY@
  #define AVAILABLE_PHYSICAL_MEMORY @_AVAILABLE_PHYSICAL_MEMORY@
  #define IS_64BIT @_IS_64BIT@
  #define HAS_FPU @_HAS_FPU@
  #define HAS_MMX @_HAS_MMX@
  #define HAS_MMX_PLUS @_HAS_MMX_PLUS@
  #define HAS_SSE @_HAS_SSE@
  #define HAS_SSE2 @_HAS_SSE2@
  #define HAS_SSE_FP @_HAS_SSE_FP@
  #define HAS_SSE_MMX @_HAS_SSE_MMX@
  #define HAS_AMD_3DNOW @_HAS_AMD_3DNOW@
  #define HAS_AMD_3DNOW_PLUS @_HAS_AMD_3DNOW_PLUS@
  #define HAS_IA64 @_HAS_IA64@
  #define OS_NAME "@_OS_NAME@"
  #define OS_RELEASE "@_OS_RELEASE@"
  #define OS_VERSION "@_OS_VERSION@"
  #define OS_PLATFORM "@_OS_PLATFORM@"

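How to do it

We will use CMake to fill in the definitions in config.h with values that make sense for our platform, and compile the example source file into an executable: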

  1. First, we define the minimum CMake version, the project name, and the project language:

    cmake_minimum_required(VERSION 3.10 FATAL_ERROR)
    project(recipe-05 CXX)
  2. Then, we define the target executable, its source file, and its include directories:

    add_executable(processor-info "")
    target_sources(processor-info
      PRIVATE
        processor-info.cpp
      )
    target_include_directories(processor-info
      PRIVATE
        ${PROJECT_BINARY_DIR}
      )
  3. Next, we query the host system for information, looping over a number of keys:

    foreach(key
      IN ITEMS
        NUMBER_OF_LOGICAL_CORES
        NUMBER_OF_PHYSICAL_CORES
        TOTAL_VIRTUAL_MEMORY
        AVAILABLE_VIRTUAL_MEMORY
        TOTAL_PHYSICAL_MEMORY
        AVAILABLE_PHYSICAL_MEMORY
        IS_64BIT
        HAS_FPU
        HAS_MMX
        HAS_MMX_PLUS
        HAS_SSE
        HAS_SSE2
        HAS_SSE_FP
        HAS_SSE_MMX
        HAS_AMD_3DNOW
        HAS_AMD_3DNOW_PLUS
        HAS_IA64
        OS_NAME
        OS_RELEASE
        OS_VERSION
        OS_PLATFORM
      )
      cmake_host_system_information(RESULT _${key} QUERY ${key})
    endforeach()
  4. With the corresponding variables defined, we configure config.h:

    configure_file(config.h.in config.h @ONLY)
  5. We are now ready to configure, build, and test the project:

    $ mkdir -p build
    $ cd build
    $ cmake ..
    $ cmake --build .
    $ ./processor-info

    Number of logical cores: 4
    Number of physical cores: 2
    Total virtual memory in megabytes: 15258
    Available virtual memory in megabytes: 14678
    Total physical memory in megabytes: 7858
    Available physical memory in megabytes: 4072
    Processor is 64Bit: 1
    Processor has floating point unit: 1
    Processor supports MMX instructions: 1
    Processor supports Ext. MMX instructions: 0
    Processor supports SSE instructions: 1
    Processor supports SSE2 instructions: 1
    Processor supports SSE FP instructions: 0
    Processor supports SSE MMX instructions: 0
    Processor supports 3DNow instructions: 0
    Processor supports 3DNow+ instructions: 0
    IA64 processor emulating x86 : 0
    OS name: Linux
    OS sub-type: 4.16.7-1-ARCH
    OS build ID: #1 SMP PREEMPT Wed May 2 21:12:36 UTC 2018
    OS platform: x86_64
  6. The output will vary depending on the processor and operating system of the host.

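How it works

The foreach loop in CMakeLists.txt queries a number of keys and defines a corresponding variable for each of them. The core command of this recipe is cmake_host_system_information, which queries system information about the host on which CMake runs. The command can be invoked with several keys in a single call, but in this recipe we used one call per key. We then use these variables to fill in the placeholders in config.h.in and generate config.h; this is done with the configure_file command. Finally, config.h is included in processor-info.cpp, and the compiled program prints the values to the screen. We will revisit this approach in Chapter 5 (Configure-time and Build-time Operations) and Chapter 6 (Generating Source Code).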

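There is more

For more fine-grained detection of the processor instruction set, consider this module: https://github.com/VcDevel/Vc/blob/master/cmake/OptimizeForArchitecture.cmake . Note also that the host building the code is sometimes not the same as the host running it. This is often the case on compute clusters, where the architecture of the login node may differ from that of the compute nodes. One way to solve this is to submit the configuration and compilation as a computation step, so that both run on the compute nodes where the code will be deployed.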