This is the third post in my Cortex‑M7 without hardware series.

Parts 1 and 2 explained the two pieces that differ from standard C/C++ projects: the linker script and the startup code. This part covers building for the embedded target, also known as “get an ELF”.

TL;DR

Under the following headings I will go through the cross compiler, the toolchain file and the CMakeLists.txt for this bare-metal project. Here are the quick steps to fetch the project, with its linker script and startup.c, and compile it.

# Clone the minimal branch of project
git clone \
--branch minimal --single-branch \
https://gitlab.com/sorhanp/arm.git \
arm-minimal

# Enter the project directory
cd arm-minimal

# Configure using toolchain and build folder
cmake -DCMAKE_TOOLCHAIN_FILE=toolchain/arm-none-eabi.cmake \
-B build

# Build
cmake --build build/

After the build has completed, the ELF is placed in a folder called build. In the next part I will go through what that folder contains. Keep on reading to understand what just happened.

Cross compiler

As the target has a completely different architecture (Cortex-M7 is a 32-bit Reduced Instruction Set Computer (RISC)), it requires a compiler that can generate code for that platform. These compilers are known as cross compilers. 1

For this purpose the standard host compiler, such as the GCC or Clang installed for native builds, will not suffice; the Arm GNU Toolchain is needed instead. However, it comes in multiple variants: AArch32 bare-metal target (arm-none-eabi), AArch32 GNU/Linux target with hard float (arm-none-linux-gnueabihf), AArch64 bare-metal target (aarch64-none-elf), AArch64 GNU/Linux target (aarch64-none-linux-gnu) and AArch64 GNU/Linux big-endian target (aarch64_be-none-linux-gnu). What do all of these mean, and which compiler is needed in this case? 2

Toolchain names follow a loose convention of the form arch-[vendor]-[os]-eabi (where [] marks an optional part):

  • arch - refers to target architecture
  • vendor - refers to toolchain supplier
  • os - refers to the target operating system
  • eabi - refers to Embedded ABI (Application Binary Interface) 3

One cross compiler on the above list is arm-none-linux-gnueabihf, which means that the target architecture is ARM, the toolchain supplier is none, the target operating system is Linux, and the ABI is the GNU EABI with hard-float. Since Cortex-M7 is a “bare-metal” target, meaning there is no operating system, the correct compiler is arm-none-eabi (ARM architecture, no operating system, Embedded ABI).

This compiler is packaged by many Linux distributions, e.g. Debian (and its derivatives such as Ubuntu) under the name gcc-arm-none-eabi, and Fedora, where the package is called arm-none-eabi-gcc-cs. 4 5

Alternatively, it can be downloaded directly from Arm’s toolchain downloads page. 6

CMake’s toolchain files

In order to use CMake for cross-compiling, a CMake file that describes the target platform has to be created. This is called the “toolchain file”. 1

Here is a toolchain file called gcc-arm-none-eabi.cmake that does just that along with comments:

# Build generic ELF file for ARM processor
set(CMAKE_SYSTEM_NAME Generic-ELF)
set(CMAKE_SYSTEM_PROCESSOR arm)

# Skip the compiler test as it won't pass with arm compiler
set(CMAKE_TRY_COMPILE_TARGET_TYPE "STATIC_LIBRARY")

# Add suffix if on Windows system
if(WIN32)
  set(TOOLCHAIN_EXECUTABLE_SUFFIX ".exe")
else()
  set(TOOLCHAIN_EXECUTABLE_SUFFIX "")
endif(WIN32)

# Set toolchain. If BAREMETAL_ARM_TOOLCHAIN_PATH is not set, the tools must be found from PATH.
# When it is set, it must end with a path separator, as it is prepended to the tool name as-is.
set(CMAKE_AR
    ${BAREMETAL_ARM_TOOLCHAIN_PATH}arm-none-eabi-ar${TOOLCHAIN_EXECUTABLE_SUFFIX}
    CACHE INTERNAL "")
set(CMAKE_ASM_COMPILER
    ${BAREMETAL_ARM_TOOLCHAIN_PATH}arm-none-eabi-gcc${TOOLCHAIN_EXECUTABLE_SUFFIX}
    CACHE INTERNAL "")
set(CMAKE_C_COMPILER
    ${BAREMETAL_ARM_TOOLCHAIN_PATH}arm-none-eabi-gcc${TOOLCHAIN_EXECUTABLE_SUFFIX}
    CACHE INTERNAL "")
set(CMAKE_CXX_COMPILER
    ${BAREMETAL_ARM_TOOLCHAIN_PATH}arm-none-eabi-g++${TOOLCHAIN_EXECUTABLE_SUFFIX}
    CACHE INTERNAL "")
set(CMAKE_LINKER
    ${BAREMETAL_ARM_TOOLCHAIN_PATH}arm-none-eabi-ld${TOOLCHAIN_EXECUTABLE_SUFFIX}
    CACHE INTERNAL "")
set(CMAKE_OBJCOPY
    ${BAREMETAL_ARM_TOOLCHAIN_PATH}arm-none-eabi-objcopy${TOOLCHAIN_EXECUTABLE_SUFFIX}
    CACHE INTERNAL "")
set(CMAKE_OBJDUMP
    ${BAREMETAL_ARM_TOOLCHAIN_PATH}arm-none-eabi-objdump${TOOLCHAIN_EXECUTABLE_SUFFIX}
    CACHE INTERNAL "")
set(CMAKE_RANLIB
    ${BAREMETAL_ARM_TOOLCHAIN_PATH}arm-none-eabi-ranlib${TOOLCHAIN_EXECUTABLE_SUFFIX}
    CACHE INTERNAL "")
set(CMAKE_SIZE
    ${BAREMETAL_ARM_TOOLCHAIN_PATH}arm-none-eabi-size${TOOLCHAIN_EXECUTABLE_SUFFIX}
    CACHE INTERNAL "")
set(CMAKE_STRIP
    ${BAREMETAL_ARM_TOOLCHAIN_PATH}arm-none-eabi-strip${TOOLCHAIN_EXECUTABLE_SUFFIX}
    CACHE INTERNAL "")

# Adjust the default behavior of the CMake's FIND_X() commands:
# search programs in the host environment
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER) 
# search headers and libraries in the target environment
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY) 
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)

# Set standards
set(CMAKE_C_STANDARD 23)
set(CMAKE_C_STANDARD_REQUIRED ON)
set(CMAKE_C_EXTENSIONS OFF)

set(CMAKE_CXX_STANDARD 23)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)

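The most important part here is skipping the compiler test by setting CMAKE_TRY_COMPILE_TARGET_TYPE to “STATIC_LIBRARY”. By default CMake tries to link a small test executable, which a bare-metal toolchain typically cannot do without the linker script and startup code, so CMake produces the following error: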

CMake Error at /usr/share/cmake-4.2/Modules/CMakeTestCXXCompiler.cmake:73 (message):
  The C++ compiler

    "/usr/bin/arm-none-eabi-g++"

  is not able to compile a simple test program.

The toolchain file is also used to set the C and C++ standards. I use 23 for both, as my toolchain supports it. It can be lowered to, for example, C++20/C++17 or C99, depending on the toolchain.

CMakeLists.txt

While the toolchain file sets the compiler, there are plenty of flags that still need to be set, as seen here:

cmake_minimum_required(VERSION 3.31)
include(FetchContent)

if(NOT CMAKE_TOOLCHAIN_FILE)
  message(FATAL_ERROR "CMAKE_TOOLCHAIN_FILE must be set before configuring the project.")
endif()

# Set the project name and language
project(
  arm
  VERSION 0.1.0
  LANGUAGES CXX C)

# Create executable
add_executable(${PROJECT_NAME} arm/main.cpp arm/startup.c)

# Link to other libraries (if needed)
target_link_libraries(
  ${PROJECT_NAME}
  PUBLIC
  PRIVATE)

if(CMAKE_CROSSCOMPILING)
  set(CPU_FEATURE_FLAGS
      "-mthumb" # Sets the Thumb instruction set, which generates 16‑bit/32‑bit mixed‑length instructions instead of the 32‑bit ARM encoding for smaller binary.
      "-mcpu=cortex-m7" # Specifies the exact target, allowing further compiler optimization.
  )

  # Generic size‑optimisation flags (that apply to both C and C++)
  set(_SIZE_GENERIC_FLAGS
      "-fno-common" # Disable “common” symbols, for speed boost and detection of accidental multiple definitions early.
      "-ffreestanding" # Set this environment as non-hosted == no full standard library guaranteed.
      "-fdata-sections" # Place each data object in its own section.
      "-ffunction-sections" # Place each function object in its own section.
      "-fmerge-constants" # Merge identical constant objects into a single read‑only section.
  )

  # C++ only size‑optimisation flags
  set(_SIZE_CXX_FLAGS
    "-fno-exceptions" # Disables C++ exception handling by removing throw, try/catch, and the unwind tables.
    "-fno-rtti" # Disables RTTI ("Run‑Time Type Information") by removing v‑table RTTI structures.
    "-fno-use-cxa-atexit" # Prevent the use of __cxa_atexit for global/static destructor registration. 
    "-fno-threadsafe-statics" # Disables the thread‑safe guard, as this environment is single threaded.
  )

  # This variable will be used in target_compile_options()
  set(OPTIMIZE_FOR_SIZE
      ${_SIZE_GENERIC_FLAGS}
      $<$<COMPILE_LANGUAGE:CXX>:${_SIZE_CXX_FLAGS}>
  )

  # Set compiler flags
  target_compile_options(${PROJECT_NAME} PRIVATE ${CPU_FEATURE_FLAGS} ${OPTIMIZE_FOR_SIZE})

  # Set linker flags
  target_link_options(
    ${PROJECT_NAME}
    PRIVATE
    "-T${CMAKE_SOURCE_DIR}/arm/memory.ld" # Use the linker script as the memory layout description.
    "-nostartfiles" # Do not link the default startup files, use custom vector table.
    "LINKER:-static" # Produces a fully static executable, no shared libraries as there is no dynamic loader.
    "LINKER:--gc-sections" # Remove unused sections provided by -fdata-sections / -ffunction-sections.
    "LINKER:--sort-section=alignment" # Sort sections by alignment, which improves memory packing.
    "LINKER:-Map=${PROJECT_NAME}.map" # Generate a "map file" that shows section sizes, address assignments, and symbol locations.
    "LINKER:--cref" # Generate a cross‑reference table of symbols in the map file.
    "LINKER:--print-memory-usage" # Prints a quick view of how much flash and RAM the final image consumes.
  )
  
endif()

if(PROJECT_IS_TOP_LEVEL)
  list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake")

  FetchContent_Declare(
    _cmake
    GIT_REPOSITORY https://gitlab.com/sorhanp/cmake.git
    GIT_TAG master # Always fetch the latest for possible changes in warnings etc.
    GIT_SHALLOW TRUE # Do not fetch full history, makes fetch faster
    UPDATE_DISCONNECTED TRUE # Do not contact the network, if the dependency is already present in the build tree
  )
  FetchContent_MakeAvailable(_cmake)

  list(APPEND CMAKE_MODULE_PATH "${_cmake_SOURCE_DIR}")
  include(CompilerWarnings)
  include(PreventInSourceBuilds)
  include(StandardProjectSettings)

  # Create interface to use the warnings specified in CompilerWarnings.cmake
  add_library(project_warnings INTERFACE)
  set(WARNINGS_AS_ERRORS ON)
  set_project_warnings(project_warnings ${WARNINGS_AS_ERRORS})
endif()

if(TARGET project_warnings)
  target_link_libraries(${PROJECT_NAME} PRIVATE project_warnings)
endif()

While the first lines are very standard CMake stuff, where the project is created along with an executable target, there is plenty of new stuff to explore in the CMAKE_CROSSCOMPILING branch. Here are some of the key concepts under their own headings.

Compiler flags

Arm compilers have plenty of their own flags to set 7, along with the myriad of flags that come from the GCC compiler itself 8, which can be used to control the C/C++ dialect, optimization and so on.

CPU_FEATURE_FLAGS

Here -mthumb selects the Thumb instruction set. Many Arm processors support two instruction sets, ARM (fixed 32‑bit encoding) and Thumb (mixed 16‑bit/32‑bit encoding), but Cortex-M cores such as the M7 execute only Thumb, so the flag mostly states the intent explicitly. The mixed‑length encoding also makes the resulting output file (ELF) smaller than equivalent 32‑bit ARM code would be.

Finally, -mcpu sets the exact core that code is generated for, as the built-in default of the arm-none-eabi compiler is not the Cortex-M7. It also allows the compiler to apply instruction selection and scheduling tuned for that core.
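
To see what these two flags change, it can help to look at the macros the compiler predefines. The sketch below is my own illustration and not part of the project: target_description is a made-up helper, and the macro names are the ones GCC defines following the Arm C Language Extensions. On a typical desktop host the function falls through to the last branch.

```cpp
// Illustration only: report which instruction set the compiler is targeting.
// target_description() is a hypothetical helper, not part of the project.
const char* target_description() {
#if defined(__ARM_ARCH_7EM__) && defined(__thumb__)
  // Expected with -mcpu=cortex-m7 -mthumb: Armv7E-M, Thumb encoding.
  return "Armv7E-M (Thumb)";
#elif defined(__thumb__)
  return "Arm (Thumb)";
#elif defined(__arm__)
  return "Arm (32-bit ARM encoding)";
#else
  return "not a 32-bit Arm target";
#endif
}
```

The same macros can also be listed without any source file by running the cross compiler with the -dM -E options.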

OPTIMIZE_FOR_SIZE

Here two variables are combined into one: a generic one that applies to both C and C++ sources (_SIZE_GENERIC_FLAGS) and one that applies only to the C++ sources (_SIZE_CXX_FLAGS). These flags are meant to make the ELF smaller by omitting things that might not be needed or available on bare-metal targets, such as exception handling (no-exceptions) and thread‑safe initialisation of statics (no-threadsafe-statics).
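
As a rough illustration of what -fno-threadsafe-statics affects, consider a function-local static whose initialiser runs at first use. boot_counter and initial_count are names I made up for this sketch. By default GCC wraps such an initialisation in a thread‑safe guard (calls to __cxa_guard_acquire and __cxa_guard_release); with the flag it can fall back to a plain “already initialised” check, which is enough when nothing else runs concurrently.

```cpp
#include <cstdint>

// Hypothetical stand-in for a value that is only known at run time.
std::uint32_t initial_count() { return 41u; }

// Function-local static: initialised the first time the function is called.
// Unless the optimiser can fold the initialiser into a constant, the compiler
// has to emit some kind of "already initialised" guard around it.
std::uint32_t& boot_counter() {
  static std::uint32_t counter = initial_count();
  return counter;
}
```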

Two important compiler options are data-sections and function-sections, which are passed to the compiler using the -f<flag> syntax. Instead of putting all functions into a single .text section and all data into a single .data section, the compiler then emits a separate section for each function and data object, named along the lines of .text.<function> and .data.<object>. This allows the linker to discard (garbage collect) unused sections more effectively (see the next heading for the linker flags that make this happen).
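
Here is a small sketch of what this means in practice. The function and variable names are hypothetical, and the section names in the comments are what I would expect from GCC, so check the map file rather than trusting them blindly.

```cpp
#include <cstdint>

// extern "C" keeps the symbol names unmangled, so the section names stay readable.
extern "C" {

// Expected section with -ffunction-sections: .text.double_it
std::uint32_t double_it(std::uint32_t value) { return value * 2u; }

// Expected section: .text.never_called. If nothing references it,
// --gc-sections can drop the whole section from the ELF.
std::uint32_t never_called(std::uint32_t value) { return value + 1u; }

// Expected section with -fdata-sections: .data.tick_count
std::uint32_t tick_count = 1u;

// Expected section: .data.spare_buffer, also removable when unreferenced.
std::uint32_t spare_buffer[16] = {1u};

}  // extern "C"
```

Without the two flags, both functions would share a single .text section and both variables a single .data section, so the linker could only keep or drop them together.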

Another interesting topic to discuss here is the freestanding flag, which stands for “freestanding environment”. This means the compiler assumes that the full standard library is not guaranteed and that only a minimal subset of it might be available. It also changes the requirements for the global main function. 9

On hosted environments a program must contain a main for startup, whereas in freestanding environments startup is defined by the implementation (as seen in part 2: startup). This means that even though this bare-metal project is going to have a main (again as seen in part 2, where extern int main(void); is declared in startup.c), having one is completely optional, and the entry point could have any arbitrary name.
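
To make that concrete, here is a simplified sketch in which the startup code calls an entry point that is not named main. Both firmware_entry and reset_handler_sketch are invented names, and the real reset handler from part 2 does more work and never returns. This version returns a value only so that it can be exercised on a host.

```cpp
// Hypothetical application entry point: any name works, as long as the
// startup code is the one calling it.
extern "C" int firmware_entry() {
  return 0;
}

// Simplified stand-in for the reset handler. A real one would prepare memory
// first and would not return after the entry point finishes.
extern "C" int reset_handler_sketch() {
  return firmware_entry();
}
```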

Also, on hosted environments it is assumed that every standard header is available, such as #include <array> 10. On freestanding this is not required, so including a header that the implementation omits results in a compile-time error. For maximum portability, stick to the headers required for freestanding, which are listed under the “Requirements on standard library headers” heading. Do note that some of them only become available in later standard versions; for example, in C++23, which this project uses, #include <string> is not among them. 9
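
As a hedged example, the following sketch sticks to two headers that a freestanding implementation is required to provide (<cstddef> and <cstdint>) and uses the standard __STDC_HOSTED__ macro to tell the two environments apart. checksum and is_hosted are illustrative names, not part of the project.

```cpp
#include <cstddef>  // std::size_t, required in freestanding
#include <cstdint>  // fixed-width integer types, required in freestanding

// __STDC_HOSTED__ is 1 on hosted implementations and 0 with -ffreestanding.
constexpr bool is_hosted() { return __STDC_HOSTED__ == 1; }

// Plain loop over a raw buffer: no containers and no dynamic allocation.
constexpr std::uint32_t checksum(const std::uint8_t* data, std::size_t length) {
  std::uint32_t sum = 0u;
  for (std::size_t i = 0; i < length; ++i) {
    sum += data[i];
  }
  return sum;
}
```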

Linker flags

As with the compiler, the linker has its own flags. 11

Linker flags are passed to the target with CMake’s target_link_options command. Compiler drivers differ in the syntax they use for forwarding options to the linker, such as -Wl,--flag or -Xlinker option. CMake provides the nifty LINKER: prefix, which does this in a portable manner without having to spell out a different form per toolchain. 12

Must haves

  • -T script passes the linker script (see part 1 for an explanation of what it contains), so that the correct memory model is used and sections are placed correctly
  • -nostartfiles, along with freestanding, is what makes startup.c (see part 2) “work”, as it forces the standard system startup files to be omitted so that our own code can be provided instead
  • LINKER:-static guarantees that all code resides in the ELF, so no shared libraries are linked against, as there is no way to load them on a bare-metal target
  • LINKER:--gc-sections removes (garbage collects) the unused .text.* and .data.* sections that were generated by the compiler
  • LINKER:--sort-section=alignment sorts the input sections within each output section by alignment (name is another option here), making the ELF more compact. Note that this is strictly a GNU ld flag, thus it is not fully portable.

Nice to haves

LINKER:-Map=${PROJECT_NAME}.map generates a “map file” that shows section sizes, address assignments and symbol locations, and LINKER:--cref adds a “cref” (cross‑reference) table of symbols to that file. As mentioned previously, there will be more on map and ELF files in the next part.

LINKER:--print-memory-usage prints a quick summary at the end of linking of how much flash and RAM the ELF consumes. Note that this is strictly a GNU ld flag, thus it is not fully portable. Where supported, the output looks like this:

Memory region         Used Size  Region Size  %age Used
           FLASH:          54 B         1 MB      0.01%
             RAM:           0 B       320 KB      0.00%

Next

In Part 4 I’ll explain what the build folder contains.

References