HPE Cray Clang C and C++ Quick Reference

HPE Cray Compiling Environment (CCE) provides Fortran, C, and C++ compilers for HPE Cray systems. The HPE Cray Clang C and C++ Quick Reference includes basic reference information for the Cray Clang C and C++ compilers that are included in CCE. This guide is intended for users and application programmers.

Additional Information Resources

Online help is available after the CCE module is loaded through:

man clang - Returns the Clang man page.
man craycc or man crayCC - Redirects you to the Clang man page. (Note that craycc and crayCC man pages used in earlier versions of CCE are replaced by aliases.)
clang --help - Returns a summary of the command line options and arguments. Because this list is lengthy, piping the output through a pager (for example, clang --help | less) may be more useful.

The man page is presumed to be more current if content differences exist between this guide and the clang man page. Note also that the complete Clang reference manual is included in HTML format in the /opt/cray/pe/cce/<version>.0.0/doc/html/index.html filesystem location.

Typographic Conventions

This style indicates program code, reserved words, library functions, command-line prompts, screen output, file/path names, variables, and other software constructs.
\ (backslash) at the end of a command line indicates the Linux shell line continuation character (lines joined by a backslash are parsed as a single line).

Introduction to CCE Clang

HPE Cray Compiling Environment (CCE) Clang supports compiling the C, C++, and UPC languages and the OpenMP parallel programming model for targets available on supported systems. Using this compiler for other languages, models, or targets is not supported; any documentation related to such features is provided as-is for reference purposes only.

Invoking Clang

The CCE Clang C and C++ compilers should be invoked using cc and CC as usual. This method sets the target based on the loaded craype-arch module and links with the usual HPE Cray libraries, including HPE Cray-optimized math functions, memcpy, and the OpenMP runtime. Use of the native clang or clang++ commands is discouraged, as doing so may not find necessary paths and will not link automatically with Cray libraries.

To invoke Clang:

For C programs

cc [options] <filename> ...
For C++ programs

CC [options] <filename> ...
For UPC programs

cc -hupc [options] <filename> ...
For HIP programs

CC [options] -x hip <filename> ...

Compilation Stages

clang is a C, C++, and Objective-C compiler that encompasses preprocessing, parsing, optimization, code generation, assembly, and linking. Depending on which high-level mode setting is passed, Clang stops before doing a full link. Although Clang is highly integrated, understanding the stages of compilation helps in understanding how to invoke it. These stages are:

Driver

The clang executable is actually a small driver that controls the overall execution of other tools, such as the compiler, assembler, and linker. Typically, you do not need to interact with the driver, but you transparently use it to run the other tools.
Preprocessing

This stage handles the tokenization of the input source file, macro expansion, #include expansion, and handling of other preprocessor directives. The output of this stage is typically called a .i (for C), .ii (for C++), .mi (for Objective-C), or .mii (for Objective-C++) file.
Parsing and Semantic Analysis

This stage parses the input file, translating preprocessor tokens into a parse tree. Once the input is in the form of a parse tree, this stage applies semantic analysis to compute types for expressions and to determine whether the code is well formed. This stage is responsible for generating most of the compiler warnings as well as parse errors. The output of this stage is an Abstract Syntax Tree (AST).
Code Generation and Optimization

This stage translates an AST into low-level intermediate representation (known as “LLVM IR”) and ultimately to machine code. This phase is responsible for optimizing the generated code and handling target-specific code generation. The output of this stage is typically called a .s file or assembly file.

Clang also supports the use of an integrated assembler, where the code generator produces object files directly. This operation avoids the overhead of generating the .s file and calling the target assembler.
Assembler

This stage runs the target assembler to translate the output of the compiler into a target object file. The output of this stage is typically called a .o file or object file.
Linker

This stage runs the target linker to merge multiple object files into an executable or dynamic library. The output of this stage is typically called an a.out, .dylib or .so file.
Static Analyzer

The Clang Static Analyzer is a tool that scans source code to try to find bugs through code analysis. This tool uses many parts of Clang and is built into the same driver. See the Clang Static Analyzer website for more details on how to use the static analyzer.

General Enhancements

Clang/LLVM provides improved performance of generated code and includes additional features. In general, performance improvements are enabled by default at appropriate optimization levels, whereas features must be requested by an option. The compiler predefines the __cray__ macro in addition to the usual Clang predefined macros.

-fcray, -fno-cray

Select the compiler’s default behavior, which provides the basis for customization by other options. The default is -fcray, which enables Cray enhancements, whereas -fno-cray disables Cray enhancements. The last instance of -fcray and -fno-cray applies. The position of -fcray or -fno-cray relative to other options does not matter. For example, with -fcray, other options that disable specific Cray enhancements are honored, and with -fno-cray, other options that enable specific Cray enhancements are honored.

Note that -fno-cray is intended to help diagnose whether a problem is caused by a Cray enhancement or is present in the base Clang/LLVM distribution. Either way, the problem should be reported to Cray to receive the fastest response.
-fenhanced-asm=<verbosity>

Emit descriptive comments in assembly code output. The default is -fenhanced-asm=1. Greater levels of verbosity will include more provenance information for inlined code. Use -fenhanced-asm=0 to disable.
-fenhanced-ir=<verbosity>

Emit descriptive comments in IR output. The default is -fenhanced-ir=1. Greater levels of verbosity will include more provenance information for inlined code. Use -fenhanced-ir=0 to disable.

Performance Options

Clang does not apply optimizations unless they are requested. For best performance, -Ofast with -flto is recommended. For applications that are sensitive to floating-point optimizations, it may be necessary to adjust the floating-point optimization level using one of the options below. For applications that require bit reproducibility (that is, applications that are designed to calculate the same result no matter how the work is distributed among a constant product of MPI ranks and OpenMP threads), it may be necessary to forgo floating-point optimization by using -O3 instead of -Ofast.

-fast

Enables -Ofast and link-time optimization.
-ffp=level

Selects a level for HPE Cray floating-point math optimizations and math library functions. Requesting the lowest level, -ffp=0, generates code with the highest precision and grants the compiler minimal freedom to optimize floating-point operations, whereas requesting the highest level, -ffp=4, grants the compiler maximal freedom to aggressively optimize but likely results in lower precision.

Requesting levels 1 through 4 flushes denormals to zero and implies -funsafe-math-optimizations and -fno-math-errno; if those options are subsequently changed, then this option may not work as expected. With -fcray, -ffp=3 is implied by -ffast-math or -Ofast. Using -ffp=0 prevents the use of HPE Cray math libraries and disables all HPE Cray floating-point optimizations.

Supported values for level are 0, 1, 2, 3, 4.
-fcray-mallopt, -fno-cray-mallopt

Optimize malloc by using HPE Cray custom mallopt parameters, which for most programs improves performance but may cause higher memory usage. This is a link-time option. The default is -fcray-mallopt.
-fivdep, -fno-ivdep

Enables or disables #pragma ivdep handling. The default is -fivdep.
-flocal-restrict, -fno-local-restrict

Honors restrict-qualified pointers declared in a block scope by assuming that they do not alias with other restrict-qualified pointers declared in the same block scope. The default is -flocal-restrict.
-floop-trips=scale

Optimizes under the assumption that loops with statically unknown trip counts have trip counts at the scale of scale.

At this time, the only valid value for scale is huge, which assumes that loops have trip counts large enough that referenced data will not fit in the cache.

Feature Options

Clang options supporting CCE features include:

-fsave-decompile

Generates decompile (.dc) and IR (.ll) files before optimization, vectorization, and code generation, as well as after LTO. A decompile is a higher-level presentation of the IR that looks similar to C source code but cannot be compiled. Use the decompile to gain insight about restructuring and optimization changes made by the compiler.
-fsave-loopmark

Generates a loopmark listing file (.lst) that shows which optimizations were applied to which parts of the source code.
-floopmark-style

Controls the style of the loopmark listing file produced when -fsave-loopmark is used. Allowed values are grouped (all messages placed at the end of the listing) and interspersed (each message placed after the relevant source code line). The default is grouped.
-finstrument-loops

Instruments loops to gather profile data to use with CrayPat.
-finstrument-openmp

Turns the insertion of the CrayPat OpenMP and accelerator tracing calls on and off.
-fcray-program-library-path=<directory>

Creates and uses a persistent repository of compiler information specified by <directory>.

The program library repository is implemented as a directory and the information contained in the program library is built up with each compiler invocation. Any compilation that does not have the -fcray-program-library-path option will not add information to this repository.

Because of the persistence of the program library, the user is responsible for managing it. For example, rm -r <directory> might be added to the “make clean” target in an application Makefile. Because the program library is a directory, use rm -r to remove it.

If an application Makefile works by creating files in multiple directories during a single build, then <directory> must be an absolute path. Otherwise, multiple and incomplete program library repositories are created. For example, avoid -fcray-program-library-path=./pl and instead use -fcray-program-library-path=/fullpath/builddir/pl.

This option may be specified with either an equal sign or a space before <directory>.
-fcray-trapping-math

Generates optimized trap-safe floating point code. This option disables any optimization which would introduce a trap where one did not exist in the source code. The default is -fno-cray-trapping-math.

Linker Options

-ffpe-trap=list

Enable traps at runtime for the specified exceptions. This option accepts a comma separated list of values. If the specified values contradict each other, the last value specified has priority.

This option does not affect compile time optimizations; it detects runtime exceptions. This option is processed only at link time and affects the entire program; it is not processed when compiling subprograms. Therefore, traps may be set using this command line option at the beginning of execution of the main program only. The program may subsequently change these settings by calling intrinsic or library procedures.

The default is -ffpe-trap=none, which means no exceptions are trapped. Possible values include:

none

Disables all traps
invalid

Trap on invalid operation
zero

Trap on divide-by-zero
fp

Trap on zero, invalid, or overflow
inexact

Trap on inexact result (or rounded result). Enabling traps for inexact results is not recommended.
overflow

Trap on overflow (or the result of an operation is too large to be represented)
underflow

Trap on underflow (or the result of an operation is too small to be represented)
denormal

Trap on denormalized operands

Uninitialized Variable Policy Control

Uninitialized variables can be a source of programming errors. The options listed below provide control over how the compiler treats these variables. Separate options for integer and floating-point types exist so that integer variables may be initialized to zero and floating-point variables may be initialized to NaN. Many bit patterns qualify as a NaN; these options use a quiet NaN of all ones because using a repeated byte pattern makes it possible to initialize large arrays using memset. These options apply only to integer and floating-point variables that are not part of structures, because structures could require an arbitrarily complex initialization sequence.

-funinitialized-heap-ints=<uninitialized | zero>

Initializes integer memory allocated by malloc or new to zero. For this option to have any effect, the void pointer returned by malloc must be typecast immediately to a pointer to an integer type because otherwise the compiler does not know how the memory will be used. For example, (int*)malloc(...).
-funinitialized-heap-floats=<uninitialized | nan>

Initializes floating-point memory allocated by malloc or new to a quiet NaN of all ones. For this option to have any effect, the void pointer returned by malloc must be typecast immediately to a pointer to a floating-point type because otherwise the compiler does not know how the memory will be used. For example, (double*)malloc(...).
-funinitialized-stack-ints=<uninitialized | zero>

Initializes stack integer variables to zero. If the -ftrivial-auto-var-init option is present, then it has precedence, and this option does nothing.
-funinitialized-stack-floats=<uninitialized | nan>

Initializes stack floating-point variables to NaN. If the -ftrivial-auto-var-init option is present, then it has precedence, and this option does nothing.
-funinitialized-static-floats=<zero | nan>

Initializes static floating-point variables to NaN.

Unified Parallel C (UPC) Options

CCE Clang options that support UPC include:

-hupc, -hdefault

-hupc configures the compiler driver to expect UPC source code. Source files with a .upc extension are automatically treated as UPC code, but this option permits a file with any other extension (typically .c) to be understood as UPC code. -hdefault cancels this behavior; if both -hupc and -hdefault appear in a command line, whichever appears last takes precedence and applies to all source files in the command line.
-fupc-auto-amo, -fno-upc-auto-amo

Automatically uses network atomics for remote updates to reduce latency. For example, x += 1 can be performed as a remote atomic add. If an update is recognized as local to the current thread, then no atomic is used. These atomics are intended as a performance optimization only and should not be relied upon to prevent race conditions. Enabled at -O1 and above.
-fupc-buffered-async, -fno-upc-buffered-async

Sets aside memory in the UPC runtime library for aggregating random remote accesses designated with #pragma pgas buffered_async. Disabled by default.
-fupc-pattern, -fno-upc-pattern

Identifies simple communication loops and aggregates the remote accesses into a single function call which replaces the loop. Enabled at -O1 and above.
-fupc-threads=<N>

Sets the number of threads for a static THREADS translation. This option causes __UPC_STATIC_THREADS__ to be defined instead of __UPC_DYNAMIC_THREADS__ and replaces all uses of the UPC keyword THREADS with the value <N>.

HIP Support and Options

HIP is supported only for AMD GPU targets and requires an AMD ROCm install for HIP header files and runtime libraries.

Several flags must be specified explicitly to compile and link HIP source files. For example, the following command lines will compile and link a HIP source file targeting an AMD MI250X GPU:

CC --offload-arch=gfx90a --rocm-path=<ROCM-INSTALL-PATH> -c -x hip [options] <filename> ...
CC --rocm-path=<ROCM-INSTALL-PATH> [options] <filename> ...

The following compiler options are relevant for compiling and linking HIP source files:

-x hip

Enable HIP compilation for any input files that appear after this option on the command line. This option should not be used on a link line with object files as input, since CCE will treat the object files as HIP source. The -x none flag can be used to cancel a prior -x hip flag on the link line.
--rocm-path=<ROCM-INSTALL-PATH>

Specifies the location of a ROCm install; used to locate HIP header files and device runtime libraries.
--offload-arch=[gfx908|gfx90a|gfx942]

Specifies the HIP offload target architecture. CCE currently supports gfx908 (AMD MI100), gfx90a (AMD MI250X), and gfx942 (AMD MI300A). This flag can be specified multiple times to produce a fat binary that contains device code for multiple GPUs.

This flag also accepts the LLVM target ID syntax, which is a target processor followed by a colon-delimited list of processor features. Each feature is a predefined string, xnack or sramecc, followed by a plus or minus sign to enable or disable the setting (for example, gfx90a:xnack+ or gfx90a:xnack-). Any unspecified processor features receive a default value of any, which ensures the resulting executable runs correctly on a processor with or without that feature. The xnack processor feature is needed to run with unified memory for AMD GPUs.
--cuda-offload-arch=[gfx908|gfx90a|gfx942]

A synonym for --offload-arch.
-fgpu-rdc, -fno-gpu-rdc

Generates relocatable device code, allowing separate compilation of HIP source files with cross-file references. Compiling with -fgpu-rdc will produce a bundled HIP offload object file that requires linking with --hip-link. Compiling with -fno-gpu-rdc will produce ordinary host object files that do not need to be linked with --hip-link. However, -fno-gpu-rdc requires that all HIP device code in a HIP source file must be completely self-contained, without referencing any external user-defined symbols. The default is -fno-gpu-rdc.
--hip-link

Enables device linking for bundled HIP offload object files. This option is required when linking object files compiled with -fgpu-rdc.
-munsafe-fp-atomics, -mno-unsafe-fp-atomics

Enables the use of native floating-point atomic instructions, which are not used by default for AMD MI250X GPUs because they are only safe for coarse-grained memory; floating-point atomic instructions operating on fine-grained memory are silently ignored. In general, memory granularity cannot be determined statically, so by default, the compiler always generates atomic compare-and-swap loops for floating-point atomic operations. (Integer atomic instructions, including atomic compare-and-swap, are safe for any memory granularity.) The -munsafe-fp-atomics compiler flag may be used to enable the generation of native floating-point atomic instructions, but you must ensure that atomic operations do not target fine-grained memory. The default is -mno-unsafe-fp-atomics, which prevents the compiler from generating native floating-point atomic instructions for operations that may target fine-grained memory at runtime.

C and C++ Language Extensions

This chapter describes the language extensions provided by CCE Clang. Some of these extensions are widely implemented in other compilers, while others are unique and specific to HPE Cray systems. Note also that CCE Clang supports regular Clang language extensions.

Performance Extensions

#pragma ivdep

If placed before a for, while, or do while loop, #pragma ivdep causes the compiler to ignore vector dependencies in the loop (including explicit dependencies) when attempting to vectorize the loop. This allows the compiler to vectorize many loops that are potentially unsafe to vectorize.

Reductions within the loop are allowed, except for reductions into global arrays. For example, a[0] += 3 is not allowed if a is a global array.

Even with #pragma ivdep, conditions other than vector dependencies can still inhibit vectorization.

Interoperability

Mixed-language programs that exchange long double data between Fortran and C or C++ object files do not work correctly on x86 targets. CCE Fortran assumes a 64-bit C_LONG_DOUBLE type, whereas Clang uses an 80-bit long double type padded to 128 bits of storage. To assist in making such programs work, the following options are available.

Note that if you are using a non-default long double format, avoid passing the long double data to library functions which expect the default format.

-mlong-double-64

Make the x86 “long double” type equivalent to the “double” type. This type matches CCE Fortran C_DOUBLE or C_LONG_DOUBLE.
-mlong-double-128

Make the x86 “long double” type equivalent to the “__float128” type. This type matches CCE Fortran C_FLOAT128.
-mlong-double-80

Make the x86 “long double” type equivalent to an 80-bit floating-point type that is padded to 128 bits of storage. This option is the default.

Additionally, the following Fortran-related option is relevant to interoperability:

-ffortran-byte-swap-io

Tell the Fortran runtime I/O subsystem to byte-swap input and output files for direct and sequential unformatted I/O. This is a link-time option to be used when linking with CCE Fortran object files.

Language Extensions

#pragma ivdep

If placed before a for, while, or do while loop, #pragma ivdep causes the compiler to ignore vector dependencies in the loop (including explicit dependencies) when attempting to vectorize the loop. This process allows the compiler to vectorize many loops that are potentially unsafe to vectorize.

Note that reductions within the loop are allowed, except for reductions into global arrays. For example, a[0] += 3 is not allowed if a is a global array.

With #pragma ivdep, conditions other than vector dependencies can still inhibit vectorization.

Options

Stage Selection Options

Option	Description
`-E`	Runs the preprocessor stage.

`-fsyntax-only`	Runs the preprocessor, parser, and semantic analysis stages.

`-S`	Runs the previous stages, as well as LLVM generation and optimization stages, and
	target-specific code generation, producing an assembly file.

`-c`	Runs all of the above stages, plus the assembler, generating a target
	`.o` object file.

no stage selection option	If no stage selection option is specified, all of the above stages are run, and
	the linker is run to combine the results into an executable or shared library.

Language Selection and Mode Options

Option	Description
`-x <language>`	Treats subsequent input files as having type language.

`-std=<standard>`	Specifies the language standard for which to compile.

	C Language

	Supported values for the C language are:

	- `c89`
	- `c90`
	- `iso9899:1990`
	ISO C 1990

	- `iso9899:199409`
	ISO C 1990 with amendment 1

	- `gnu89`
	- `gnu90`
	ISO C 1990 with GNU extensions

	- `c99`
	- `iso9899:1999`
	ISO C 1999

	- `gnu99`
	ISO C 1999 with GNU extensions

	- `c11`
	- `iso9899:2011`
	ISO C 2011

	- `gnu11`
	ISO C 2011 with GNU extensions

	- `c17`
	- `iso9899:2017`
	ISO C 2017

	- `gnu17`
	ISO C 2017 with GNU extensions. The default C language standard
	is `gnu17`, except on PS4, where it is `gnu99`.

	C++ Language

	Supported values for the C++ language are:

	- `c++98`
	- `c++03`
	ISO C++ 1998 with amendments

	- `gnu++98`
	- `gnu++03`
	ISO C++ 1998 with amendments and GNU extensions

	- `c++11`
	ISO C++ 2011 with amendments

	- `gnu++11`
	ISO C++ 2011 with amendments and GNU extensions

	- `c++14`
	ISO C++ 2014 with amendments

	- `gnu++14`
	ISO C++ 2014 with amendments and GNU extensions

	- `c++17`
	ISO C++ 2017 with amendments

	- `gnu++17`
	ISO C++ 2017 with amendments and GNU extensions

	- `c++20`
	ISO C++ 2020 with amendments

	- `gnu++20`
	ISO C++ 2020 with amendments and GNU extensions

	- `c++23`
	ISO C++ 2023 with amendments

	- `gnu++23`
	ISO C++ 2023 with amendments and GNU extensions

	- `c++2c`
	Working draft for C++2c

	- `gnu++2c`
	Working draft for C++2c with GNU extensions. The default C++
	language standard is `gnu++17`.

	OpenCL Language

	Supported values for the OpenCL language are:

	- `cl1.0`
	OpenCL 1.0

	- `cl1.1`
	OpenCL 1.1

	- `cl1.2`
	OpenCL 1.2

	- `cl2.0`
	OpenCL 2.0. The default OpenCL language standard is `cl1.0`.

	CUDA Language

	The supported value for the CUDA language is:

	- `cuda`
	NVIDIA CUDA™

`-stdlib=<library>`	Specifies the C++ standard library to use; supported options are
	`libstdc++` and `libc++`. If not specified, the platform default is
	used.

`-rtlib=<library>`	Specifies the compiler runtime library to use; supported options
	are `libgcc` and `compiler-rt`. If not specified, `compiler-rt`
	is used if the `-fcray` option is enabled, otherwise the platform
	default is used.

`-ansi`	Same as `-std=c89`.

`-ObjC, -ObjC++`	Treats source input files as Objective-C and Objective-C++ inputs
	respectively.

`-trigraphs`	Enables trigraphs.

`-ffreestanding`	Indicates that the file should be compiled for a freestanding (not
	a hosted) environment. It is assumed that a freestanding environment
	also provides `memcpy`, `memmove`, `memset`, and `memcmp`
	implementations, as these functions are needed for efficient codegen
	for many programs.

`-fno-builtin`	Disables special handling and optimizations of well-known library
	functions, like :c:func:`strlen` and :c:func:`malloc`.

`-fno-builtin-<function>`	Disables special handling and optimizations for the specific library
	function. For example, `-fno-builtin-strlen` removes any special
	handling for the :c:func:`strlen` library function.

`-fno-builtin-std-<function>`	Disables special handling and optimizations for the specific C++
	standard library function in the namespace `std`. For example,
	`-fno-builtin-std-move_if_noexcept` removes any special
	handling for the :cpp:func:`std::move_if_noexcept` library
	function.

	For C standard library functions that the C++ Standard Library also
	provides in namespace `std`, use :option:`-fno-builtin-\<function\>`
	instead.

`-fmath-errno`	Indicates that math functions should be treated as updating
	:c:data:`errno`.

`-fpascal-strings`	Enables support for Pascal-style strings with “\pfoo”.

`-fms-extensions`	Enables support for Microsoft extensions.

`-fmsc-version=`	Sets `_MSC_VER`. If on Windows, this selection defaults to
	either the same value as the currently installed version of
	`cl.exe` or `1933`. This option is not set otherwise.

`-fborland-extensions`	Enables support for Borland extensions.

`-fwritable-strings`	Makes all string literals default to writable. Disables the creation
	of unique strings and other optimizations.

`-flax-vector-conversions`	Allows loose type checking rules for implicit vector conversions.
`-flax-vector-conversions=<kind>`	Possible values of `<kind>`:
`-fno-lax-vector-conversions`
	- `none` - Disallows implicit conversions between vectors.
	- `integer` - Allows implicit bitcasts between integer vectors
	of the same overall bit-width.
	- `all` - Allows implicit bitcasts between any vectors of the same
	overall bit-width.

	`<kind>` defaults to `integer`, if unspecified.

`-fblocks`	Enables the “Blocks” language feature.

`-fobjc-abi-version=version`	Selects the Objective-C ABI version to use. Available versions
	are:

	- 1 (legacy fragile ABI)
	- 2 (non-fragile ABI 1)
	- 3 (non-fragile ABI 2)

`-fobjc-nonfragile-abi-version=<version>`	Selects the default version of Objective-C non-fragile ABI to use.
	This selection is only used as the Objective-C ABI if the non-fragile
	ABI is enabled (either through :option:`-fobjc-nonfragile-abi`
	or because it is the platform default).

`-fobjc-nonfragile-abi`	Enables the use of Objective-C non-fragile ABI. On platforms for
`-fno-objc-nonfragile-abi`	which this is the default ABI, it can be disabled with
	:option:`-fno-objc-nonfragile-abi`.

Target Selection Options

Clang fully supports cross compilation as an inherent part of its design. Depending on how your version of Clang is configured, it may have support for a number of cross-compilers or may only support a native target.

Option	Description
`-arch <architecture>`	Specifies the architecture to build for (Mac OS X specific).

`-target <architecture>`	Specifies the architecture to build for (all platforms).

`-mmacosx-version-min=<version>`	If building for macOS, specifies the minimum version supported by your
	application.

`-miphoneos-version-min`	If building for iPhone OS, specifies the minimum version supported by
	your application.

`--print-supported-cpus`	Prints out a list of supported processors for the given target
	(specified through `--target=<architecture>` or :option:`-arch`
	`<architecture>`). If no target is specified, the system default
	target is used.

`-mcpu=?, -mtune=?`	Acts as an alias for :option:`--print-supported-cpus`.

`-mcpu=help, -mtune=help`	Acts as an alias for :option:`--print-supported-cpus`.

`-march=<cpu>`	Specifies that Clang should generate code for a specific processor
	family member and later. For example, if you specify `-march=i486`,
	the compiler is allowed to generate instructions that are valid on
	i486 and later processors, but which may not exist on earlier ones.

Code Generation Options

Option	Description
`-O0, -O1, -O2, -O3, -Ofast,`	Specifies which optimization level to use:
`-Os, -Oz, -Og, -O, -O4`
	- `-O0` - Indicates “no optimization”. This level compiles the
	fastest and generates the most debuggable code.

	- `-O1` - Optimization level is between :option:`-O0` and
	:option:`-O2`.

	- `-O2` - Moderate level of optimization which enables most
	optimizations.

	- `-O3` - Similar to :option:`-O2`, except that it enables
	optimizations that take longer to perform or may generate larger
	code (in an attempt to make the program run faster).

	- `-Ofast` - Enables all the optimizations from :option:`-O3`
	along with other aggressive optimizations that may violate strict
	compliance with language standards.

	- `-Os` - Similar to :option:`-O2`, except with extra optimization
	to reduce code size.

	- `-Oz` - Similar to :option:`-Os` (and thus :option:`-O2`), but
	reduces code size further.

	- `-Og` - Similar to :option:`-O1`.

	- `-O` - Equivalent to :option:`-O1`.

	- `-O4` and higher - Equivalent to :option:`-O3`.

`-g, -gline-tables-only, -gmodules`	Controls debug information output. Clang debug information works
	best at :option:`-O0`. If more than one option starting with
	`-g` is specified, the last one wins:

	- :option:`-g` - Generates debug information.

	- :option:`-gline-tables-only` - Generates only line table debug
	information. This allows for symbolicated backtraces with inlining
	information, but does not include any information about variables,
	their locations or types.

	- :option:`-gmodules` - Generates debug information that contains
	external references to types defined in Clang modules or precompiled
	headers instead of emitting redundant debug type information into
	every object file. This option transparently switches the Clang module
	format to object file containers that hold the Clang module together
	with the debug information. When compiling a program that uses
	Clang modules or precompiled headers, this option produces complete
	debug information with faster compile times and much smaller object
	files.

	This option should not be used if you are building static libraries for
	distribution to other machines. This constraint is in place because the
	debug info contains references to the module cache on the machine on
	which the object files in the library were built.

`.. option:: -fstandalone-debug`	Clang supports a number of optimizations to reduce the size of
`-fno-standalone-debug`	debug information in the binary. They work based on the assumption
	that the debug type information can be spread out over multiple
	compilation units. For instance, Clang will not emit type definitions
	for types that the module does not need and could be replaced with a
	forward declaration. Further, Clang only emits type information for a
	dynamic C++ class in the module that contains the vtable for the class.

	The :option:`-fstandalone-debug` option turns off these
	optimizations. This option is useful if you are working with third-
	party libraries that do not come with debug information. This option
	is the default on Darwin. Note that Clang never emits type information
	for types that are not referenced at all by the program.

`.. option:: -feliminate-unused-debug-types`	By default, Clang does not emit type information for types that
	are defined but not used in a program. To retain the debug information
	for these unused types, the `-fno-eliminate-unused-debug-types`
	negation can be used.

`.. option:: -fexceptions`	Allows exceptions to be thrown through Clang-compiled stack frames
	(on many targets, this enables unwind information for functions that
	might have an exception thrown through them). For most targets, this
	option is enabled by default for C++.

`.. option:: -ftrapv`	Generates code to catch integer overflow errors. Signed integer
	overflow is undefined in C. With this flag, extra code is generated to
	detect this operation and abort whenever it happens.
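
	Because :option:`-ftrapv` aborts the program, code that must keep running can test for overflow before adding. The following is a minimal sketch in portable C; `checked_add` is a hypothetical helper, not part of any library.

```c
#include <limits.h>
#include <stdbool.h>

/* Adds a and b and stores the sum in *out.
 * Returns false, leaving *out untouched, if the sum does not fit in int. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;   /* a + b would overflow, which is undefined in C */
    }
    *out = a + b;       /* safe: the range was checked above */
    return true;
}
```

	An unguarded `a + b` with the same overflowing inputs is the kind of operation that the extra code generated by :option:`-ftrapv` is meant to detect.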

`.. option:: -fvisibility`	Sets the default visibility level.

`.. option:: -fcommon, -fno-common`	Specifies that variables without initializers get common linkage.
	It can be disabled with :option:`-fno-common`.

`.. option:: -ftls-model=<model>`	Sets the default thread-local storage (TLS) model to use for
	thread-local variables. Valid values are:

	- “global-dynamic”
	- “local-dynamic”
	- “initial-exec”
	- “local-exec”

	The default is “global-dynamic”. The default model can be overridden
	with the `tls_model` attribute. The compiler attempts to choose a more
	efficient model, if possible.
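
	The `tls_model` attribute applies to a single variable. The following is a minimal sketch, assuming a C11 compiler and an ELF target such as Linux; the names `call_count` and `count_call` are hypothetical.

```c
/* Per-thread counter. The attribute requests the "initial-exec" model for
 * this variable only, whatever default -ftls-model selects.
 * Assumption: an ELF target where this model is valid. */
static _Thread_local int call_count __attribute__((tls_model("initial-exec")));

/* Returns how many times the calling thread has called this function. */
int count_call(void)
{
    return ++call_count;
}
```

	Each thread sees its own copy of `call_count`, so the function needs no locking.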

`.. option:: -flto, -flto=full, -flto=thin, -emit-llvm`	Generates output files in LLVM formats, suitable for link time
	optimization. If used with :option:`-S`, generates LLVM intermediate
	language assembly files. Otherwise, generates LLVM bitcode format
	object files (which may be passed to the linker, depending on the
	stage selection options).

	The default for :option:`-flto` is “full”, indicating that the
	LLVM bitcode is suitable for monolithic Link Time Optimization (LTO),
	where the linker merges all such modules into a single combined
	module for optimization. With “thin”, :doc:`ThinLTO <../ThinLTO>`
	compilation is initiated instead.

`.. note::`	On Darwin, if you are using :option:`-flto` along with :option:`-g`
	and compiling and linking in separate steps, you must also pass
	`-Wl,-object_path_lto,<lto-filename>.o` at the linking step to
	instruct the ld64 linker not to delete the temporary object file
	generated during Link Time Optimization (this flag is automatically
	passed to the linker by Clang if compilation and linking are done in a
	single step). This action allows for the debugging of the executable,
	as well as generating the `.dSYM` bundle using
	:manpage:`dsymutil(1)`.

Driver Options

Option	Description
`.. option:: -###`	Prints (but does not run) commands for running this compilation.

`.. option:: --help`	Displays available options.

`.. option:: -Qunused-arguments`	Refrains from emitting any warnings for unused driver arguments.

`.. option:: -Wa,<args>`	Passes comma-separated arguments in `args` to the assembler.

`.. option:: -Wl,<args>`	Passes comma-separated arguments in `args` to the linker.

`.. option:: -Wp,<args>`	Passes comma-separated arguments in `args` to the preprocessor.

`.. option:: -Xanalyzer <arg>`	Passes `arg` to the static analyzer.

`.. option:: -Xassembler <arg>`	Passes `arg` to the assembler.

`.. option:: -Xlinker <arg>`	Passes `arg` to the linker.

`.. option:: -Xpreprocessor <arg>`	Passes `arg` to the preprocessor.

`.. option:: -o <file>`	Writes output to `file`.

`.. option:: -print-file-name=<file>`	Prints the full library path of `file`.

`.. option:: -print-libgcc-file-name`	Prints the library path for the currently used compiler runtime library
	(“libgcc.a” or “libclang_rt.builtins.*.a”).

`.. option:: -print-prog-name=<name>`	Prints the full program path of `name`.

`.. option:: -print-search-dirs`	Prints the paths used for finding libraries and programs.

`.. option:: -save-temps`	Saves intermediate compilation results.

`.. option:: -save-stats,`	Saves internal code generation (LLVM) statistics to a file in the current
`-save-stats=cwd, -save-stats=obj`	directory (:option:`-save-stats` or `-save-stats=cwd`) or the directory
	of the output file (`-save-stats=obj`).

	You can also use environment variables to control the statistics reporting.
	Setting `CC_PRINT_INTERNAL_STAT` to `1` enables the feature; the
	report goes to stdout in JSON format.

	Setting `CC_PRINT_INTERNAL_STAT_FILE` to a file path writes the
	statistics to the given file in JSON format.

	Note that `-save-stats` takes precedence over `CC_PRINT_INTERNAL_STAT`
	and `CC_PRINT_INTERNAL_STAT_FILE`.

`.. option:: -integrated-as, -no-integrated-as`	Enables and disables, respectively, the use of the integrated assembler.
	Whether the integrated assembler is on by default is target dependent.

`.. option:: -time`	Times individual commands.

`.. option:: -ftime-report`	Prints timing summary of each stage of compilation.

`.. option:: -v`	Shows the commands to run and uses verbose output.

Diagnostic Options

Option	Description
`.. option:: -fshow-column`,	Controls how Clang prints out information about diagnostics (errors and
`-fshow-source-location`,	warnings). See the Clang User’s Manual for more information.
`-fcaret-diagnostics`,
`-fdiagnostics-fixit-info`,
`-fdiagnostics-parseable-fixits`,
`-fdiagnostics-print-source-range-info`,
`-fprint-source-range-info`,
`-fdiagnostics-show-option`,
`-fmessage-length`

Preprocessor Options

Option	Description
`.. option:: -D<macroname>=<value>`	Adds an implicit #define into the predefined buffer which is read before
	the source file is preprocessed.
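
	A common use of `-D` is to override a default that the source supplies. The following is a minimal sketch; the macro `BUFFER_SIZE` and the function `buffer_capacity` are hypothetical names chosen for illustration.

```c
/* BUFFER_SIZE is a hypothetical macro. Compiling with -DBUFFER_SIZE=1024
 * defines it before this file is read, so the fallback below is skipped. */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 256
#endif

static char buffer[BUFFER_SIZE];

/* Returns the buffer capacity that was selected at compile time. */
unsigned long buffer_capacity(void)
{
    return (unsigned long)sizeof buffer;
}
```

	Built without `-D`, the function reports the fallback of 256 bytes; built with `-DBUFFER_SIZE=1024`, it reports 1024.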

`.. option:: -U<macroname>`	Adds an implicit #undef into the predefined buffer which is read before
	the source file is preprocessed.

`.. option:: -include <filename>`	Adds an implicit #include into the predefined buffer which is read before
	the source file is preprocessed.

`.. option:: -I<directory>`	Adds the specified directory to the search path for include files.

`.. option:: -F<directory>`	Adds the specified directory to the search path for framework include
	files.

`.. option:: -nostdinc`	Does not search the standard system directories or compiler built-in
	directories for include files.

`.. option:: -nostdlibinc`	Does not search the standard system directories for include files, but
	does search compiler built-in include directories.

`.. option:: -nobuiltininc`	Does not search Clang’s built-in directory for include files.

`.. option:: -fkeep-system-includes`	Usable only with :option:`-E`. Does not copy the preprocessed content of
	“system” headers to the output; instead, preserves the #include directive.
	This selection can greatly reduce the volume of text produced by
	:option:`-E`, which can be helpful when trying to produce a “small”
	reproducible test case.

	This option does not guarantee reproducibility, however. If the including
	source defines preprocessor symbols that influence the behavior of system
	headers (for example, `_XOPEN_SOURCE`), the operation of :option:`-E`
	will remove that definition and thus can change the semantics of the
	included header. Also, using a different version of the system headers
	(especially a different version of the STL) may result in different
	behavior. Always verify the preprocessed file by compiling it separately.

Environment Options

Option	Description
`.. envvar:: MALLOC_MMAP_MAX_`	Specifies the maximum number of memory chunks to allocate with mmap. If
	:option:`-fcray-mallopt` (default) is used, the compiler changes this
	from the glibc default to 0. For most HPC programs, runtime performance
	is improved by this setting, but more memory may be consumed. The default
	glibc behavior can be restored by linking with :option:`-fno-cray-mallopt`
	or setting :envvar:`CRAY_MALLOPT_OFF` at runtime. A custom setting of
	:envvar:`MALLOC_MMAP_MAX_` (see :manpage:`mallopt(3)` for details) will also
	override this Cray change.

`.. envvar:: MALLOC_TRIM_THRESHOLD_`	Specifies the minimum size of the unused memory region at the top of the
	heap before the region is returned to the operating system. If
	:option:`-fcray-mallopt` (default) is used, the compiler changes this
	from the glibc default to 536870912 bytes. For most HPC programs, this
	setting improves runtime performance, but more memory may be consumed.
	The default glibc behavior can be restored by linking with
	:option:`-fno-cray-mallopt` or setting :envvar:`CRAY_MALLOPT_OFF`
	at runtime. A custom setting of :envvar:`MALLOC_TRIM_THRESHOLD_` (see
	:manpage:`mallopt(3)` for details) will also override this Cray change.
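
	According to :manpage:`mallopt(3)`, these two environment variables correspond to the glibc `M_MMAP_MAX` and `M_TRIM_THRESHOLD` parameters. A program can request the values described above from its own code. The following is a minimal sketch, assuming a glibc system. It is not a description of how :option:`-fcray-mallopt` is implemented, and `apply_hpc_malloc_settings` is a hypothetical name.

```c
#include <stdlib.h>   /* on glibc this also defines __GLIBC__ */
#if defined(__GLIBC__)
#include <malloc.h>   /* mallopt, M_MMAP_MAX, M_TRIM_THRESHOLD */
#endif

/* Requests the two values described above from inside the program.
 * Returns 1 on success, 0 if glibc rejected a value, and -1 when the
 * C library is not glibc (nothing is changed in that case). */
int apply_hpc_malloc_settings(void)
{
#if defined(__GLIBC__)
    int ok = mallopt(M_MMAP_MAX, 0);                  /* no mmap-backed chunks */
    ok = mallopt(M_TRIM_THRESHOLD, 536870912) && ok;  /* 512 MiB trim threshold */
    return ok ? 1 : 0;
#else
    return -1;
#endif
}
```

	Calling the function early in `main`, before large allocations are made, gives the settings the widest effect.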

`.. envvar:: TMPDIR, TEMP, TMP`	These environment variables are checked, in order, for the location to
	write temporary files used during the compilation process.
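
	The lookup order can be sketched as follows. This is an illustration of the order stated above, not the compiler's own code. The function names and the "/tmp" fallback are assumptions, and treating only an unset variable (not an empty one) as missing is also an assumption.

```c
#include <stddef.h>
#include <stdlib.h>

/* Returns the first non-NULL value, checked in the order TMPDIR, TEMP, TMP,
 * or fallback when none of them is set. */
const char *pick_temp_dir(const char *tmpdir, const char *temp,
                          const char *tmp, const char *fallback)
{
    if (tmpdir != NULL) return tmpdir;
    if (temp != NULL) return temp;
    if (tmp != NULL) return tmp;
    return fallback;
}

/* Reads the three variables from the environment. The "/tmp" fallback is
 * an assumption made for illustration. */
const char *temp_dir_from_env(void)
{
    return pick_temp_dir(getenv("TMPDIR"), getenv("TEMP"), getenv("TMP"), "/tmp");
}
```

	Keeping the ordering logic in `pick_temp_dir` separate from the `getenv` calls makes it possible to check the order without modifying the environment.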

`.. envvar:: CPATH`	If this environment variable is present, it is treated as a delimited list
	of paths to be added to the default system include path list. The delimiter
	is the platform-dependent delimiter, as used in the PATH environment
	variable.

	Empty components in the environment variable are ignored.
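
	The rule about empty components can be sketched as follows. This is a minimal illustration of splitting a delimited list, not the compiler's own parser; `count_path_entries` is a hypothetical name, and the delimiter is passed in because it is platform dependent.

```c
#include <stddef.h>

/* Counts the non-empty components of a delimited path list. */
size_t count_path_entries(const char *list, char delimiter)
{
    size_t count = 0;
    size_t length = 0;   /* length of the component being scanned */

    if (list == NULL) return 0;
    for (const char *p = list; ; ++p) {
        if (*p == delimiter || *p == '\0') {
            if (length > 0) ++count;   /* empty components are skipped */
            length = 0;
            if (*p == '\0') break;
        } else {
            ++length;
        }
    }
    return count;
}
```

	With a colon delimiter, a value such as `/opt/a/include::/opt/b/include:` contains two usable entries; the doubled and trailing delimiters contribute nothing.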

`.. envvar:: C_INCLUDE_PATH,`	These environment variables specify additional paths, as for
`OBJC_INCLUDE_PATH, CPLUS_INCLUDE_PATH,`	:envvar:`CPATH`, which are only used when processing the appropriate
`OBJCPLUS_INCLUDE_PATH`	language.

`.. envvar:: MACOSX_DEPLOYMENT_TARGET`	If :option:`-mmacosx-version-min` is unspecified, the default deployment
	target is read from this environment variable. This variable affects
	only Darwin targets.