As a beginner or intermediate C++ programmer, you have probably heard advice like "never mix unsigned and signed integer types" or "avoid unsigned integers altogether". There was also this talk about undefined behaviour. Yet, in embedded software development, there is no way around unsigned integers – so what is behind all these warnings?

Wonderful that you strive to know the facts! I’m gladly taking you on a short journey – exploring some of the deepest abysses of the C++ language. After this experience, I hope you will perfectly understand how the compiler handles integer operations and therefore write better and more reliable code.

## Preface

Before we start, I would like to talk about a few boring but crucial premises. Even though I address embedded development, all the facts are valid for desktop applications too. I will use a modern C++ syntax in my examples but keep all code on the level of the C++17 standard.

Be aware of the differences between C and C++. Not all things said for C++ are automatically true for C. If you are still using C, please move on to a more modern language, like Rust, Go or C++.

You will find all code examples in a git repository with CMake build files. As these are merely demonstrations of the topics I explain in this article, I omitted most of the usual comments and API documentation to keep them small and readable.

## Integers and Processors

Wikipedia has an interesting article about integers. It states the word *integer* is Latin, meaning “whole” (or “untouched”). In mathematics, the term defines positive and negative whole numbers, including zero. This is basically true for computer processors as well, but for practical reasons, the numbers are limited to a certain size, and there is a special type with no negative numbers.

The size of an integer is given as the *number of bits,* and it is either signed or unsigned. Signed integers can represent negative numbers, while unsigned ones can only represent zero and positive numbers. On the hardware level, signed and unsigned integers are stored in the same way. The only difference is the interpretation: for signed integers, the highest bit is used to indicate a negative value.

The illustration above shows the most commonly used integer sizes: 8, 16, 32 and 64 bit. Many processors can handle 128-bit and larger integers too. For each size, there is a signed and unsigned version. On the left side of the bit illustration, I wrote the type names from the standard library for a type of the given size.

The format of this type specification is simple: Unsigned integers start with “`uint`”, signed ones start with “`int`”. This is followed by the number of bits `8`, `16`, `32` or `64` and ends with the suffix `_t` to mark it as a type of the standard library.

## Small Excursion through the C++ Standard

The C++ standard is particularly fuzzy in how integer types are defined:

There are five standard signed integer types: “signed char”, “short int”, “int”, “long int”, and “long long int”. In this list, each type provides at least as much storage as those preceding it in the list. (…) The range of representable values for a signed integer type is −2^{N−1} to 2^{N−1} − 1 (inclusive), where N is called the width of the type.

– from ISO/IEC 14882:2020, Fundamental types

Also, the standard defines unsigned integers like this:

For each of the standard signed integer types, there exists a corresponding (but different) standard unsigned integer type: “unsigned char”, “unsigned short int”, “unsigned int”, “unsigned long int”, and “unsigned long long int”. (…) The range of representable values for the unsigned type is 0 to 2^{N} − 1 (inclusive); arithmetic for the unsigned type is performed modulo 2^{N}.

– from ISO/IEC 14882:2020, Fundamental types

These two sections leave much room for interpretation. Later, the sizes and behaviour of these fundamental types are further defined. There are minimum sizes defined for the types:

| Type | Minimum Size in Bits |
|---|---|
| signed char | 8 |
| short | 16 |
| int | 16 (!) |
| long | 32 |
| long long | 64 |

The minimum size of the commonly used `int` type can be as small as 16 bits. A C++ implementation could also decide to make these types larger than expected, like 32 bits for `char`, `short` and `int`.

As the sizes of `short`, `int` and `long` depend on the compiler implementation for a platform, the standard library comes with the header `<cstdint>` where a set of types with defined sizes is declared:

`int8_t`, `uint8_t`, `int16_t`, `uint16_t`, `int32_t`, `uint32_t`, `int64_t`, `uint64_t`

**Never rely on specific sizes of short, int and long!**

Use the `intX_t` and `uintX_t` type definitions from `<cstdint>` to get an integer of the specified size. If an integer of that size does not exist for the platform, the type definition is not declared.

So if you rely on a 32-bit value, but someone tries to compile your code on a platform that does not support 32-bit values, it will fail instead of creating a program with unpredicted behaviour.

## Test the Language Implementation

Let’s start with a lightweight test of our compiler environment. We use a simple program checking out the implementation of all fundamental integer types.

### Project 01-Types

```cpp
#include <TypeInfo.hpp>

auto main() -> int {
    std::cout << "Fundamental Language Types:\n";
    printTypeInfo<char>("char");
    printTypeInfo<wchar_t>("wchar_t");
    printTypeInfo<signed char>("signed char");
    printTypeInfo<unsigned char>("unsigned char");
    printTypeInfo<signed short>("signed short");
    printTypeInfo<unsigned short>("unsigned short");
    printTypeInfo<signed int>("signed int");
    printTypeInfo<unsigned int>("unsigned int");
    printTypeInfo<signed long>("signed long");
    printTypeInfo<unsigned long>("unsigned long");
    printTypeInfo<signed long long>("signed long long");
    printTypeInfo<unsigned long long>("unsigned long long");
    printTypeInfo<float>("float");
    printTypeInfo<double>("double");
    printTypeInfo<long double>("long double");
    printTypeInfo<bool>("bool");

    std::cout << "\nDefinitions from <cstdint>:\n";
    printTypeInfo<int8_t>("int8_t");
    printTypeInfo<uint8_t>("uint8_t");
    printTypeInfo<int16_t>("int16_t");
    printTypeInfo<uint16_t>("uint16_t");
    printTypeInfo<int32_t>("int32_t");
    printTypeInfo<uint32_t>("uint32_t");
    printTypeInfo<int64_t>("int64_t");
    printTypeInfo<uint64_t>("uint64_t");
    // (...)
    return 0;
}
```

I use a template function `printTypeInfo` to display basic information about the type on the console. This template function is implemented in the header `TypeInfo.hpp`. As you see, I added `char`, but also the signed and unsigned version of `char`. From the C++ standard:

Type char is a distinct type that has an implementation-defined choice of “signed char” or “unsigned char” as its underlying type. The values of type char can represent distinct codes for all members of the implementation’s basic character set. (…)

– from ISO/IEC 14882:2020, Fundamental types

First, `char` is a distinct type, and second, it is either based on a `signed char` or an `unsigned char`.

When I compile and run the small executable on my *ARM based* computer using *clang*, I get the following output for the language-defined types.

```
Fundamental Language Types:
char: int8
wchar_t: int32
signed char: int8
unsigned char: uint8
signed short: int16
unsigned short: uint16
signed int: int32
unsigned int: uint32
signed long: int64
unsigned long: uint64
signed long long: int64
unsigned long long: uint64
float: float
double: double
long double: double
bool: uint8
```

The main types defined in `<cstdint>` produce the output below:

```
Definitions from <cstdint>:
int8_t: int8
uint8_t: uint8
int16_t: int16
uint16_t: uint16
int32_t: int32
uint32_t: uint32
int64_t: int64
uint64_t: uint64
```

Compile the example with the compilers and platforms of your choice. You will most likely see different results, especially for `short` and `long`. Also, `char` is signed or unsigned, depending on the platform.

This is in contrast to the language types, like `int`: If the types `int8_t` to `uint64_t` from the `<cstdint>` header are defined for your compile environment, they must have exactly the expected size, neither smaller nor larger. This is defined in the C standard, so you can rely on these sizes.

## Unexpected Results with Integer Literals

To use certain integer values in your program, you write them as literals. There are many options for writing integer literals: prefixes control the number system, and suffixes select the integer type.

### Prefixes

| Prefix | Meaning | Examples |
|---|---|---|
| no prefix | Decimal Number | `0` `56'293` `7000` |
| `0b` or `0B` | Binary Number | `0b10010011` `0B00110011'00001111` |
| `0` | Octal Number | `074` |
| `0x` or `0X` | Hexadecimal Number | `0xffff` `0X20` `0x1000'47ab` |

You can (and should) use the apostrophe character `'` to group digits for all number systems. If a decimal number literal contains a decimal point `.`, it is interpreted as a floating point number.

### Suffixes

| Suffix | Meaning | Examples |
|---|---|---|
| no suffix | no change | `5780` |
| `l` or `L` | `long` | `5780l` `5780ul` |
| `u` or `U` | `unsigned` | `5780u` |
| `ll` or `LL` | `long long` | `5780ll` `5780ull` |

There are no suffixes to create `char`, `unsigned char`, `signed char` and `short` values from numbers. To create a char, you have to use a character literal like `'\xff'`.

### Project 02-literals

Let’s try this out and see what types we get from several different literals:

```cpp
#include <TypeInfo.hpp>

auto main() -> int {
    std::cout << "Type from Literal:\n";
    std::cout << "100 => " << TypeInfo(100).str() << "\n";
    std::cout << "070 => " << TypeInfo(070).str() << "\n";
    std::cout << "0b10001000 => " << TypeInfo(0b10001000).str() << "\n";
    std::cout << "0b10001000u => " << TypeInfo(0b10001000u).str() << "\n";
    std::cout << "0x1000 => " << TypeInfo(0x1000).str() << "\n";
    std::cout << "0x1000u => " << TypeInfo(0x1000u).str() << "\n";
    std::cout << "100u => " << TypeInfo(100u).str() << "\n";
    std::cout << "100l => " << TypeInfo(100l).str() << "\n";
    std::cout << "100ul => " << TypeInfo(100ul).str() << "\n";
    std::cout << "100ll => " << TypeInfo(100ll).str() << "\n";
    std::cout << "100ull => " << TypeInfo(100ull).str() << "\n";
}
```

After compiling this code, I get the following output:

```
Type from Literal:
100 => int32{100}
070 => int32{56}
0b10001000 => int32{136}
0b10001000u => uint32{136}
0x1000 => int32{4096}
0x1000u => uint32{4096}
100u => uint32{100}
100l => int64{100}
100ul => uint64{100}
100ll => int64{100}
100ull => uint64{100}
```

All literals without suffixes are converted into the `int` type as long as the value fits. If you add `u` for unsigned, you get an `unsigned int`. Adding `l` will produce `long` types and `ll` will create `long long` types.

### How Negative Numbers are Handled

You may have wondered why I only used positive values so far, even for the signed integer types. The reason is **there are no negative integer literals** in C++. Instead, the C++ language defines unary `+` and `-` operators. While the unary `+` operator has no effect on the value, the functionality of the unary `-` operator is defined as follows:

The operand of the unary – operator shall have arithmetic or unscoped enumeration type and the result is the negation of its operand. Integral promotion is performed on integral or enumeration operands. The negative of an unsigned quantity is computed by subtracting its value from 2^{n}, where n is the number of bits in the promoted operand. The type of the result is the type of the promoted operand.

– from ISO/IEC 14882:2020, Unary operators

If you write a negative integer literal, like `-500`, the compiler interprets this as `negation(500)`. This is no problem for most integer literals, but at the extremes, you get side effects.

Let’s compile the following project and check its output:

### Project 03-negation

```cpp
#include <TypeInfo.hpp>

void literals() {
    std::cout << "Unexpected Side Effects:\n";

    auto int8a = std::numeric_limits<int8_t>::min();
    auto int8b = -128;
    int8_t int8c = -128;
    std::cout << "int8a = " << TypeInfo(int8a).str() << "\n";
    std::cout << "int8b = " << TypeInfo(int8b).str() << "\n";
    std::cout << "int8c = " << TypeInfo(int8c).str() << "\n\n";

    auto int16a = std::numeric_limits<int16_t>::min();
    auto int16b = -32768;
    int16_t int16c = -32768;
    std::cout << "int16a = " << TypeInfo(int16a).str() << "\n";
    std::cout << "int16b = " << TypeInfo(int16b).str() << "\n";
    std::cout << "int16c = " << TypeInfo(int16c).str() << "\n\n";

    auto int32a = std::numeric_limits<int32_t>::min();
    auto int32b = -2147483648;
    int32_t int32c = -2147483648;
    std::cout << "int32a = " << TypeInfo(int32a).str() << "\n";
    std::cout << "int32b = " << TypeInfo(int32b).str() << "\n";
    std::cout << "int32c = " << TypeInfo(int32c).str() << "\n\n";

    auto int64a = std::numeric_limits<int64_t>::min();
    auto int64b = -9223372036854775808ll;
    int64_t int64c = -9223372036854775808ll;
    std::cout << "int64a = " << TypeInfo(int64a).str() << "\n";
    std::cout << "int64b = " << TypeInfo(int64b).str() << "\n";
    std::cout << "int64c = " << TypeInfo(int64c).str() << "\n\n";
}
```

For each signed integer type, we create a variable and initialise it with the smallest possible value. First using `std::numeric_limits<T>::min()`, second as an integer literal with `auto` as type, and last with the same integer literal – forcing it into the expected type.

If you compile the project, you get the following warning:

```
warning: integer literal is too large to be represented in a signed integer type, interpreting as unsigned [-Wimplicitly-unsigned-literal]
    auto int64b = -9223372036854775808ll;
```

The reason for this warning is how the compiler reads the code. It reads the literal without the negative sign and tries to fit it into the largest possible signed integer type. This is not possible, as the number is too large. The compiler is forced to interpret the number as an unsigned integer, which is why it issues this warning.

*After* reading the literal, the *unary negation operator* is applied to the value. Yet, in this special case, the operator has no effect, as you will see shortly.

### Signed Integer Ranges are not Balanced

The illustration below shows this fact using a 4-bit integer.

As you can see, a signed integer can represent more negative than positive numbers. This is true of all signed integers. There is always one negative value more, which is the source of many problems.

### Analysing the Output from the Project

With this knowledge, we understand the strange output from the project. The code was compiled using *clang* on a *64-bit AMD* platform.

```
Unexpected Side Effects:
int8a = int8{-128}
int8b = int32{-128}
int8c = int8{-128}

int16a = int16{-32768}
int16b = int32{-32768}
int16c = int16{-32768}

int32a = int32{-2147483648}
int32b = int64{-2147483648}
int32c = int32{-2147483648}

int64a = int64{-9223372036854775808}
int64b = uint64{9223372036854775808}
int64c = int64{-9223372036854775808}
```

The 8-bit and 16-bit values behave as expected. As there is no suffix to limit an integer literal to types smaller than `int`, the values `-128` and `-32768` were interpreted as signed 32-bit values.

The first interesting oddity is how variable `int32b` with value `-2'147'483'648` is interpreted as 64-bit, even though the value fits perfectly into a signed 32-bit integer. As you now understand, a compiler has to come to this conclusion.

If an integer-literal cannot be represented by any type in its list and an extended integer type can represent its value, it may have that extended integer type. (…) A program is ill-formed if one of its translation units contains an integer-literal that cannot be represented by any of the allowed types.

– from ISO/IEC 14882:2020, Integer literals

- The compiler reads the literal `2'147'483'648`, which is larger than the largest signed 32-bit integer (`2'147'483'647`). Therefore, it uses a 64-bit integer.
- Next, the *negate* operator is applied to the literal, which turns the signed 64-bit value into the negative value `-2'147'483'648`. The result keeps the 64-bit type.

If you force the value back into a signed 32-bit integer, you end up with the correct value – this is also the case with the last problem, variable `int64b`.

- The compiler reads the literal `9'223'372'036'854'775'808`. It does not fit into a signed 64-bit integer, and there is no larger fundamental integer type. Therefore, the compiler puts the value into an unsigned 64-bit value, issuing a warning.
- Now, the negate operator is applied to the literal: 2^{64} − `9'223'372'036'854'775'808` = `9'223'372'036'854'775'808`

Compared to the change from the signed 32-bit to the signed 64-bit value, this situation is problematic, which is why the compiler issued the warning. If this literal is used as part of an expression, an unsigned type is used that could cause unexpected results.

```cpp
void sideEffects() {
    std::cout << "Side Effects:\n";

    auto r1 = -9'223'372'036'854'775'808ll / 10'000'000;
    std::cout << "-9'223'372'036'854'775'808ll / 10'000'000 = "
              << TypeInfo(r1).str() << "\n";

    int64_t r2 = -9'223'372'036'854'775'808ll / 10'000'000;
    std::cout << "int64_t{-9'223'372'036'854'775'808ll / 10'000'000} = "
              << TypeInfo(r2).str() << "\n";

    auto r3 = std::numeric_limits<int64_t>::min() / 10'000'000;
    std::cout << "std::numeric_limits<int64_t>::min() / 10'000'000 = "
              << TypeInfo(r3).str() << "\n";
}
```

The output from this part is shown below:

```
Side Effects:
-9'223'372'036'854'775'808ll / 10'000'000 = uint64{922337203685}
int64_t{-9'223'372'036'854'775'808ll / 10'000'000} = int64{922337203685}
std::numeric_limits<int64_t>::min() / 10'000'000 = int64{-922337203685}
```

Because the dividend of the division is an unsigned integer, the result is a positive value instead of a negative one. If the result of the expression is assigned to a signed integer, the problem remains.

The safe way to get the correct result is by using `std::numeric_limits<int64_t>::min()`.

- If you need an integer literal of a defined size and type, initialize it like this: `int16_t{-12}`
- Use `std::numeric_limits<T>::min()` and `std::numeric_limits<T>::max()` to use signed integer variables with the largest or smallest values. While using the maximum value as a literal is not problematic, by using `numeric_limits` you spell out your intention to use a special value in the integer range.
- Be aware that the minus sign in front of numbers is an operator and not part of the literal.
- If undefined behaviour is involved, the code may work fine with one compiler or even one configuration (debug) but fail when compiled by another compiler or in another configuration (release).
- Enable all warnings, e.g. with `-Wall` and `-Wextra`. Never ignore warnings.

## Signed Integer Math is Strange

Math with signed integers has several weaknesses. Let’s discover them. First, as mentioned before, signed integers have more negative numbers than positive ones. There is also always one negative number that cannot be converted into a positive equivalent.

### Make Negative Numbers Positive / Project 04-math

```cpp
void makePositive() {
    std::cout << "Make Positive:\n";

    int8_t r1 = std::numeric_limits<int8_t>::min() * int8_t{-1};
    std::cout << "int8_t r1 = std::numeric_limits<int8_t>::min() * int8_t{-1}; r1 = "
              << TypeInfo(r1).str() << "\n";

    int32_t r2 = std::numeric_limits<int32_t>::min() * -1;
    std::cout << "int32_t r2 = std::numeric_limits<int8_t>::min() * -1; r2 = "
              << TypeInfo(r2).str() << "\n";

    int8_t r3 = std::abs(std::numeric_limits<int8_t>::min());
    std::cout << "int8_t r3 = std::abs(std::numeric_limits<int8_t>::min()); r3 = "
              << TypeInfo(r3).str() << "\n";

    int32_t r4 = std::abs(std::numeric_limits<int32_t>::min());
    std::cout << "int32_t r4 = std::abs(std::numeric_limits<int32_t>::min()); r4 = "
              << TypeInfo(r4).str() << "\n\n";
}
```

If I compile the code above, I get the following two warning messages:

```
04-math/src/main.cpp:12:54: warning: overflow in expression; result is -2147483648 with type 'int' [-Winteger-overflow]
    int32_t r2 = std::numeric_limits<int32_t>::min() * -1;
                                                     ^
04-math/src/main.cpp:10:52: warning: implicit conversion from 'int' to 'int8_t' (aka 'signed char') changes value from 128 to -128 [-Wconstant-conversion]
    int8_t r1 = std::numeric_limits<int8_t>::min() * int8_t{-1};
           ~~   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
```

Because of the two warning messages, we have to expect problems with the first two calculations (`r1` and `r2`). For the last two calculations, there are no warnings.

```
Make Positive:
int8_t r1 = std::numeric_limits<int8_t>::min() * int8_t{-1}; r1 = int8{-128}
int32_t r2 = std::numeric_limits<int8_t>::min() * -1; r2 = int32{-2147483648}
int8_t r3 = std::abs(std::numeric_limits<int8_t>::min()); r3 = int8{-128}
int32_t r4 = std::abs(std::numeric_limits<int32_t>::min()); r4 = int32{-2147483648}
```

By using `auto` the compiler can choose the result type:

```cpp
void makePositiveWithAuto() {
    std::cout << "Make Positive using auto:\n";

    auto r1 = std::numeric_limits<int8_t>::min() * int8_t{-1};
    std::cout << "auto r1 = std::numeric_limits<int8_t>::min() * int8_t{-1}; r1 = "
              << TypeInfo(r1).str() << "\n";

    auto r2 = std::numeric_limits<int32_t>::min() * -1;
    std::cout << "auto r2 = std::numeric_limits<int32_t>::min() * -1; r2 = "
              << TypeInfo(r2).str() << "\n";

    auto r3 = std::abs(std::numeric_limits<int8_t>::min());
    std::cout << "auto r3 = std::abs(std::numeric_limits<int8_t>::min()); r3 = "
              << TypeInfo(r3).str() << "\n";

    auto r4 = std::abs(std::numeric_limits<int32_t>::min());
    std::cout << "auto r4 = std::abs(std::numeric_limits<int32_t>::min()); r4 = "
              << TypeInfo(r4).str() << "\n";

    auto r5 = static_cast<uint32_t>(std::abs(std::numeric_limits<int32_t>::min()));
    std::cout << "auto r5 = static_cast<uint32_t>(std::abs(std::numeric_limits<int32_t>::min())); r5 = "
              << TypeInfo(r5).str() << "\n\n";
}
```

```
Make Positive using auto:
auto r1 = std::numeric_limits<int8_t>::min() * int8_t{-1}; r1 = int32{128}
auto r2 = std::numeric_limits<int32_t>::min() * -1; r2 = int32{-2147483648}
auto r3 = std::abs(std::numeric_limits<int8_t>::min()); r3 = int32{128}
auto r4 = std::abs(std::numeric_limits<int32_t>::min()); r4 = int32{-2147483648}
auto r5 = static_cast<uint32_t>(std::abs(std::numeric_limits<int32_t>::min())); r5 = uint32{2147483648}
```

Because `int8_t` is automatically promoted to `int` (here a 32-bit integer), the results `r1` and `r3` are correct now.

As the literal `-1` is interpreted as `int`, which is a 32-bit integer on this platform, the multiplication for `r2` does not widen the result, and neither does `std::abs` for `r4`. As the minimum has no matching positive number, the operation overflows and the result is incorrect.

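The compiler issues a warning for the expression at `r2`, but there is no warning if you use `std::abs`. For the result `r4`, the function `std::abs` appears to have no effect, because negating the minimum value in two's complement produces the same bit pattern again. Formally, both cases are signed overflows and therefore undefined behaviour.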

If you cast the result to an unsigned integer, as shown with `r5`, you see the correct result.

### Make Signed to Unsigned Absolute

If your software at one point switches from signed to unsigned math and you have to deal with negative numbers, one solution is a simple function converting signed into absolute unsigned numbers.

```cpp
template<typename T>
constexpr auto unsignedAbs(T value) -> std::make_unsigned_t<T> {
    static_assert(std::is_integral_v<T>);
    if constexpr (std::is_signed_v<T>) {
        using R = std::make_unsigned_t<T>;
        return (value == std::numeric_limits<T>::min())
            ? (static_cast<R>(std::numeric_limits<T>::max()) + R{1u})
            : static_cast<R>(std::abs(value));
    } else {
        return value;
    }
}

void makeUnsignedPositive() {
    std::cout << "Make Unsigned Positive:\n";

    auto r1 = unsignedAbs(std::numeric_limits<int8_t>::min());
    std::cout << "auto r1 = unsignedAbs(std::numeric_limits<int8_t>::min()); r1 = "
              << TypeInfo(r1).str() << "\n";

    auto r2 = unsignedAbs(std::numeric_limits<int16_t>::min());
    std::cout << "auto r2 = unsignedAbs(std::numeric_limits<int16_t>::min()); r2 = "
              << TypeInfo(r2).str() << "\n";

    auto r3 = unsignedAbs(std::numeric_limits<int32_t>::min());
    std::cout << "auto r3 = unsignedAbs(std::numeric_limits<int32_t>::min()); r3 = "
              << TypeInfo(r3).str() << "\n";

    auto r4 = unsignedAbs(std::numeric_limits<int64_t>::min());
    std::cout << "auto r4 = unsignedAbs(std::numeric_limits<int64_t>::min()); r4 = "
              << TypeInfo(r4).str() << "\n\n";
}
```

This function has the benefit of returning a valid positive number. If you look at the results, you get:

```
Make Unsigned Positive:
auto r1 = unsignedAbs(std::numeric_limits<int8_t>::min()); r1 = uint8{128}
auto r2 = unsignedAbs(std::numeric_limits<int16_t>::min()); r2 = uint16{32768}
auto r3 = unsignedAbs(std::numeric_limits<int32_t>::min()); r3 = uint32{2147483648}
auto r4 = unsignedAbs(std::numeric_limits<int64_t>::min()); r4 = uint64{9223372036854775808}
```

- Remember that there is **one exceptional negative value** that cannot be converted into a positive one.
- Implement **a strategy** to deal with this situation:
    - Limit your number range to keep all calculations in a valid range.
    - Switch from signed to unsigned integer values if you need to capture the full range of negative values.
    - Always work with integers of defined size to stay in control of the side effects.
- Compiler warnings about overflows are serious, even if they sometimes point at the wrong part of your code.
- If undefined behaviour is involved, the code may work fine with one compiler or even one configuration (debug) but fail when compiled by another compiler or in another configuration (release).
- Enable all warnings, e.g. with `-Wall` and `-Wextra`. Never ignore warnings.

## Problems with Overflowing Operations

C++ has no protection in place to guarantee safe mathematical operations for integer types. Overflows in integer operations are silently ignored, leaving you in the best case with the lower part of the result or in the worst case causing undefined behaviour.

The range of representable values for the unsigned type is 0 to 2^{N} − 1 (inclusive); arithmetic for the unsigned type is performed modulo 2^{N}.

Unsigned arithmetic does not overflow. **Overflow for signed arithmetic yields undefined behavior.**

– from ISO/IEC 14882:2020, Types

### No Problems with Unsigned Integers

If you do math using an **unsigned integer** and an operation overflows, you simply get the expected result modulo 2^{N}, where N is the size of the type in bits. This works for additions, subtractions and multiplications, providing a predictable result for all these operations.

See the illustration above for the principle of modulo arithmetic. As the standard states, unsigned operations never overflow. If the result of an operation would generate bits outside of the size of the target integer, they are simply ignored.

The reason why C++ implements operations like this is backward compatibility with C. And C implemented the operations like this to be as close as possible to the way early CPUs worked.

A good illustration is how the add operation is compiled into machine code. Look at the simple function below, which just adds two unsigned 32-bit integer values.

```cpp
#include <cstdint>

auto add(uint32_t a, uint32_t b) -> uint32_t {
    return a + b;
}
```

The generated machine code is dead simple on every architecture:

x86-64:

```
add(unsigned int, unsigned int):
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     DWORD PTR [rbp-8], esi
        mov     edx, DWORD PTR [rbp-4]
        mov     eax, DWORD PTR [rbp-8]
        add     eax, edx
        pop     rbp
        ret
```

MIPS64:

```
add(unsigned int, unsigned int):
        daddiu  $sp,$sp,-32
        sd      $fp,24($sp)
        move    $fp,$sp
        move    $3,$4
        move    $2,$5
        sll     $3,$3,0
        sw      $3,0($fp)
        sll     $2,$2,0
        sw      $2,4($fp)
        lw      $3,0($fp)
        lw      $2,4($fp)
        addu    $2,$3,$2
        move    $sp,$fp
        ld      $fp,24($sp)
        daddiu  $sp,$sp,32
        jr      $31
        nop
```

ARM64:

```
add(unsigned int, unsigned int):        // @add(unsigned int, unsigned int)
        sub     sp, sp, #16
        str     w0, [sp, #12]
        str     w1, [sp, #8]
        ldr     w8, [sp, #12]
        ldr     w9, [sp, #8]
        add     w0, w8, w9
        add     sp, sp, #16
        ret
```

It simply uses the `add` instruction to add the values of two registers.

### Unexpected Results with Signed Integers

In my experience, the short sentence “Overflow for signed arithmetic yields undefined behavior” is often overlooked by beginners. One of the problems is that compilers *most of the time* produce code that behaves exactly as if you were working with unsigned integers. Especially non-optimised code is compiled into simple `add` instructions that generate the same predictable overflowing results.

Let’s try to expose undefined behaviour with the following example code.

### Project 05-overflow

```cpp
#include <TypeInfo.hpp>
#include <cmath>

template<typename T>
constexpr T badSaturatingSubtract(T a, T b) noexcept {
    // BAD CODE! Signed integer overflow is undefined.
    static_assert(std::is_integral_v<T>);
    if (b == 0)
        return a;
    const T result = a - b;
    if constexpr (std::is_signed_v<T>) {
        if ((result < a) == std::signbit(b)) {
            return std::signbit(b) ? std::numeric_limits<T>::max()
                                   : std::numeric_limits<T>::min();
        }
    } else {
        if (result > a) {
            return 0;
        }
    }
    return result;
}

template<typename T>
inline void doShadySaturatingMath(T a, T b) noexcept {
    auto result = badSaturatingSubtract(a, b);
    std::cout << "badSaturatingSubtract(" << TypeInfo(a).str() << ", "
              << TypeInfo(b).str() << ") = " << TypeInfo(result).str() << "\n";
}

template<typename T>
inline void doShadySaturatingMath() noexcept {
    auto a = std::numeric_limits<T>::min();
    auto b = T{1};
    doShadySaturatingMath(a, b);
    a = std::numeric_limits<T>::max();
    b = T{-1};
    doShadySaturatingMath(a, b);
    a = T{-100};
    b = T{-500};
    doShadySaturatingMath(a, b);
}

auto main() -> int {
    std::cout << "\nSigned Math 16-bit:\n";
    doShadySaturatingMath<int16_t>();
    std::cout << "\nSigned Math 32-bit:\n";
    doShadySaturatingMath<int32_t>();
    std::cout << "\nSigned Math 64-bit:\n";
    doShadySaturatingMath<int64_t>();
    return 0;
}
```

The bad code is in line 10 (marked): the subtraction `a - b` of two signed integers can overflow. Next, the code tests whether the result changed in the expected direction. If this isn’t the case, theoretically, an overflow would have been detected – but only if signed integers worked like unsigned ones.

I use `std::signbit` to test for negative values because this enables additional optimisations in the clang compiler (Version 13.1.6 / clang-1316.0.21.2.5), which make the undefined behaviour visible.

Debug build:

```
Signed Math 16-bit:
badSaturatingSubtract(int16{-32768}, int16{1}) = int16{-32768}
badSaturatingSubtract(int16{32767}, int16{-1}) = int16{32767}
badSaturatingSubtract(int16{-100}, int16{-500}) = int16{400}

Signed Math 32-bit:
badSaturatingSubtract(int32{-2147483648}, int32{1}) = int32{-2147483648}
badSaturatingSubtract(int32{2147483647}, int32{-1}) = int32{2147483647}
badSaturatingSubtract(int32{-100}, int32{-500}) = int32{400}

Signed Math 64-bit:
badSaturatingSubtract(int64{-9223372036854775808}, int64{1}) = int64{-9223372036854775808}
badSaturatingSubtract(int64{9223372036854775807}, int64{-1}) = int64{9223372036854775807}
badSaturatingSubtract(int64{-100}, int64{-500}) = int64{400}
```

Release build:

```
Signed Math 16-bit:
badSaturatingSubtract(int16{-32768}, int16{1}) = int16{-32768}
badSaturatingSubtract(int16{32767}, int16{-1}) = int16{32767}
badSaturatingSubtract(int16{-100}, int16{-500}) = int16{400}

Signed Math 32-bit:
badSaturatingSubtract(int32{-2147483648}, int32{1}) = int32{2147483647}
badSaturatingSubtract(int32{2147483647}, int32{-1}) = int32{-2147483648}
badSaturatingSubtract(int32{-100}, int32{-500}) = int32{400}

Signed Math 64-bit:
badSaturatingSubtract(int64{-9223372036854775808}, int64{1}) = int64{9223372036854775807}
badSaturatingSubtract(int64{9223372036854775807}, int64{-1}) = int64{-9223372036854775808}
badSaturatingSubtract(int64{-100}, int64{-500}) = int64{400}
```

You can see the difference in the result between the debug and release build of the code. It only affects the operations where an overflow occurs.

- You can rely on the defined “overflow” behaviour of **unsigned integers**.
- But overflow in **signed arithmetic** yields undefined behaviour. **Never rely on a specific result if a signed integer overflows.** The effect may not be solely a wrong result; as demonstrated, the undefined behaviour can also affect following operations and lead to unpredictable behaviour.
- Always write unit tests where possible. Compile and run the unit tests **not only with debug settings** but *also optimized* with release settings.
- If you do a lot of integer mathematics, work with saturating operations. Many processors offer dedicated instructions for this, so the overhead is usually small.
- Enable all warnings, e.g. with `-Wall` and `-Wextra`. Never ignore warnings.
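To make the advice about saturating operations concrete, here is a minimal sketch of a subtraction that checks the range *before* it subtracts, so no signed overflow can occur. The name `saturatingSubtractSketch` is hypothetical and this is not the implementation from the repository; it only assumes C++17.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper (not the article's implementation): subtracts b from a
// and clamps the result to the range of T instead of overflowing.
template<typename T>
constexpr auto saturatingSubtractSketch(T a, T b) noexcept -> T {
    static_assert(std::is_integral_v<T> && std::is_signed_v<T>);
    using Limits = std::numeric_limits<T>;
    // Test before subtracting; min() + b and max() + b cannot overflow here.
    if (b > 0 && a < Limits::min() + b) {
        return Limits::min(); // a - b would fall below the minimum
    }
    if (b < 0 && a > Limits::max() + b) {
        return Limits::max(); // a - b would exceed the maximum
    }
    return static_cast<T>(a - b);
}

using I32 = std::numeric_limits<std::int32_t>;

// Evaluated at compile time, so debug and release builds cannot disagree.
static_assert(saturatingSubtractSketch<std::int32_t>(I32::min(), 1) == I32::min());
static_assert(saturatingSubtractSketch<std::int32_t>(I32::max(), -1) == I32::max());
static_assert(saturatingSubtractSketch<std::int32_t>(-100, -500) == 400);
```

Because the checks are `constexpr`, any signed overflow inside the function would be rejected by the compiler in the `static_assert` lines, which makes them a cheap additional unit test.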

## Comparisons with Unexpected Results

Integer comparison can be problematic if you mix signed and unsigned integers. This can happen by accident because literals without suffixes are signed by default.

    uint32_t a = 0xf0000000u;
    if (a < 1) { ... };

This compares a `uint32_t` with a signed `int`. You need to know that C++ does not properly compare signed and unsigned types; instead, it first converts both sides into the same type before the comparison is made.

> If both operands are of arithmetic or enumeration type, the usual arithmetic conversions are performed on both operands; (…)
>
> – from ISO/IEC 14882:2020, Equality Operators

### Integer Promotions

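For integral types, these conversions start with the “integer promotions”: every operand with a rank lower than `int` is first converted to `int` (or to `unsigned int` if `int` cannot hold all of its values). Afterwards, both operands of an arithmetic expression are brought to a common type. The rules, applied to the promoted types, are shown below: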

- If type A == type B, the types are used unchanged.
- If A is signed and B is signed, or A is unsigned and B is unsigned, use the one with the larger *rank* for both types. `int16_t` + `int64_t` → `int64_t`.
- If A is signed and B is unsigned, or A is unsigned and B is signed:
  - If the unsigned type has an equal to or larger *rank* than the signed type, use the unsigned type. `uint32_t` + `int32_t` → `uint32_t`, `uint64_t` + `int32_t` → `uint64_t`
  - If the signed type has a larger *rank* than the unsigned type and can hold all values of the unsigned type, use the signed type. `uint16_t` + `int32_t` → `int32_t`
  - If no other rule matches, the unsigned type that corresponds to the signed type (same *size*) shall be used for both sides. `long` + `unsigned int` → `unsigned long`, if `long` and `int` are both 32 bits wide.

The last rule seems to be redundant, but note the difference between rank and size. The *rank* of integers was mainly introduced to deal with different integer types of the same size.
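You can let the compiler confirm these rules. The following sketch asks for the type of `A + B` and checks it with `static_assert`. The alias `ArithmeticResult` is a hypothetical helper, and the checks assume a platform with a 32-bit `int` and the usual fixed-width typedefs, which holds for mainstream desktop and 32-bit embedded targets but is not guaranteed by the standard.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical alias: the type the compiler gives to the expression `A + B`.
template<typename A, typename B>
using ArithmeticResult = decltype(std::declval<A>() + std::declval<B>());

// Operands narrower than int are promoted first, even if both types are equal.
static_assert(std::is_same_v<ArithmeticResult<std::int16_t, std::int16_t>, int>);

// Same signedness: the type with the larger rank wins.
static_assert(std::is_same_v<ArithmeticResult<std::int16_t, std::int64_t>, std::int64_t>);

// Mixed signedness, unsigned rank equal or larger: the unsigned type wins.
static_assert(std::is_same_v<ArithmeticResult<std::uint32_t, std::int32_t>, std::uint32_t>);
static_assert(std::is_same_v<ArithmeticResult<std::uint64_t, std::int32_t>, std::uint64_t>);

// Mixed signedness, signed type can hold every unsigned value: signed wins.
static_assert(std::is_same_v<ArithmeticResult<std::uint16_t, std::int32_t>, std::int32_t>);

// Same width, mixed signedness: the result is unsigned.
static_assert(std::is_same_v<ArithmeticResult<std::int64_t, std::uint64_t>, std::uint64_t>);
```

If one of these assertions fails on your target, the platform uses different integer widths, which is exactly the situation where the rules above deserve a second look.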

### Rank Rules

- All chars have the same rank: `char` == `signed char` == `unsigned char`
- Standard integers are ranked like this: `signed char` < `short int` < `int` < `long int` < `long long int`
- Unsigned integers have the same rank as signed ones: `unsigned X` == `signed X`
- Standard integers must always have a higher rank than extended ones of the same size. If `long long int` is 128-bit wide, it has a higher rank than `__int128`.
- Bool shall have a lower rank than all other standard integers: `bool` < `char`, `signed char`, `unsigned char`, etc.
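The difference between rank and size can be checked without assuming particular integer widths. In this sketch, `CommonOf` and `Expected` are hypothetical helper names. The last check selects the expected type from the number of value bits, so it should hold on platforms where `long` is wider than `int` as well as on those where both have the same width.

```cpp
#include <limits>
#include <type_traits>
#include <utility>

// Hypothetical alias: the common type chosen for the expression `A + B`.
template<typename A, typename B>
using CommonOf = decltype(std::declval<A>() + std::declval<B>());

// Rank decides, even on platforms where both types have the same size.
static_assert(std::is_same_v<CommonOf<int, long>, long>);
static_assert(std::is_same_v<CommonOf<long, long long>, long long>);

// bool has the lowest rank and is promoted to int.
static_assert(std::is_same_v<CommonOf<bool, bool>, int>);

// long outranks unsigned int, but the result depends on the value range:
// if long can hold every unsigned int value, it wins; otherwise the
// unsigned counterpart of long is used.
constexpr bool longHoldsUnsignedInt =
    std::numeric_limits<long>::digits >= std::numeric_limits<unsigned int>::digits;
using Expected = std::conditional_t<longHoldsUnsignedInt, long, unsigned long>;
static_assert(std::is_same_v<CommonOf<unsigned int, long>, Expected>);
```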

With this knowledge, you can understand how the types of operands of an operation are converted before it is executed:

    uint32_t a = 0xf0000000u;
    if (a < 1) { ... };

    uint32_t{0xf0000000u} < (int{1} → uint32_t{1})
    uint32_t{0xf0000000u} < uint32_t{1} == false

### Project 06-comparison

Let’s look at some examples where the comparison seems to get the wrong result.

    #include <TypeInfo.hpp>

    #include <cstdint>
    #include <iostream>
    #include <limits>

    // Prints the result of the built-in `==` for the given operands.
    template<typename A, typename B>
    void printIsEqual(A a, B b) {
        const bool isEqual = (a == b);
        std::cout << TypeInfo(a).str() << " == " << TypeInfo(b).str()
                  << " => " << std::boolalpha << isEqual << "\n";
    }

    // Prints the result of the built-in `<` for the given operands.
    template<typename A, typename B>
    void printIsLess(A a, B b) {
        const bool isLess = (a < b);
        std::cout << TypeInfo(a).str() << " < " << TypeInfo(b).str()
                  << " => " << std::boolalpha << isLess << "\n";
    }

    void languageComparisons() {
        std::cout << "\nC++ Comparisons:\n";
        printIsEqual(0, 0);
        printIsEqual(std::numeric_limits<int32_t>::min(), 0x80000000u);
        printIsEqual(-1, 0xffffffffu);
        printIsLess(-1, 0u);
    }

If I compile this code, I get several warnings about comparisons between signed and unsigned integers. But the code is perfectly valid; it does not rely on undefined behaviour. The compiler warns because the way the integers are converted before the comparison will most likely not give you the result you expect:

    C++ Comparisons:
    int32{0} == int32{0} => true
    int32{-2147483648} == uint32{2147483648} => true
    int32{-1} == uint32{4294967295} => true
    int32{-1} < uint32{0} => false

I added the first comparison to remind you that the literal value zero has a type. While the value zero is unproblematic in a comparison, adding `0` to an expression may change the type of the result.
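A short sketch of this effect, using only standard types; the variable name `small` is just an illustration:

```cpp
#include <cstdint>
#include <type_traits>

// Plain literals are signed int; the suffix `u` makes them unsigned int.
static_assert(std::is_same_v<decltype(0), int>);
static_assert(std::is_same_v<decltype(0u), unsigned int>);

constexpr std::uint8_t small{200};

// Adding a literal zero changes the type of the whole expression.
static_assert(std::is_same_v<decltype(small + 0), int>);
static_assert(std::is_same_v<decltype(small + 0u), unsigned int>);

// The sum is computed as int, so it does not wrap around at 255 ...
static_assert(small + small == 400);
// ... unless the result is converted back to the 8-bit type.
static_assert(static_cast<std::uint8_t>(small + small) == 144);
```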

The next lines show a few unexpected results, even if they are correct if all the rules from above are applied. In plain sight like this, they are easy to spot and understand, but if this comparison is part of a larger expression and you ignore the warning messages from the compiler, you may end up with strange side effects.

- Don’t compare signed with unsigned integers.
- Always remember that plain literal values, like `0` or `1`, are *signed* integers unless you add the suffix `u`.
- Enable all warnings, e.g. with `-Wall` and `-Wextra`, and never ignore warnings.

If you have to compare signed and unsigned integers in your code, the best approach is to write a comparison function that handles all cases as expected:

    #include <type_traits>

    enum class [[nodiscard]] Ordering { Equal, Less, Greater };

    // Compares b to a for every combination of integer types.
    template<typename A, typename B>
    constexpr auto compareInt(A a, B b) noexcept -> Ordering {
        static_assert(std::is_integral_v<A> && std::is_integral_v<B>);
        if constexpr (std::is_signed_v<A> == std::is_signed_v<B>) {
            // Same signedness: the common type can hold both values.
            using C = std::common_type_t<A, B>;
            const auto ca = static_cast<C>(a);
            const auto cb = static_cast<C>(b);
            return (ca <= cb) ? ((ca == cb) ? Ordering::Equal : Ordering::Greater) : Ordering::Less;
        } else if constexpr (std::is_signed_v<B>) {
            // a is unsigned, b is signed: a negative b is always smaller.
            using C = std::common_type_t<A, std::make_unsigned_t<B>>;
            const auto ca = static_cast<C>(a);
            const auto cb = static_cast<C>(b);
            return (b < 0) ? Ordering::Less
                           : ((ca <= cb) ? ((ca == cb) ? Ordering::Equal : Ordering::Greater) : Ordering::Less);
        } else {
            // a is signed, b is unsigned: b is always greater than a negative a.
            using C = std::common_type_t<std::make_unsigned_t<A>, B>;
            const auto ca = static_cast<C>(a);
            const auto cb = static_cast<C>(b);
            return (a < 0) ? Ordering::Greater
                           : ((ca <= cb) ? ((ca == cb) ? Ordering::Equal : Ordering::Greater) : Ordering::Less);
        }
    }

    template<typename A, typename B>
    void printCompareInt(A a, B b) {
        const auto result = compareInt(a, b);
        std::cout << "compareInt(" << TypeInfo(a).str() << ", " << TypeInfo(b).str() << ") => ";
        switch (result) {
        case Ordering::Equal: std::cout << "Equal"; break;
        case Ordering::Less: std::cout << "Less"; break;
        case Ordering::Greater: std::cout << "Greater"; break;
        }
        std::cout << "\n";
    }

    void comparisonsWithCompareInt() {
        std::cout << "\ncompareInt() Comparisons:\n";
        printCompareInt(0, 0);
        printCompareInt(std::numeric_limits<int32_t>::min(), 0x80000000u);
        printCompareInt(-1, 0xffffffffu);
        printCompareInt(-1, 0u);
    }

Using this function in your code produces only very few additional CPU instructions, but you can safely compare every combination of integer types. The result indicates how `b` is compared to `a`. So `Ordering::Less` means `b` is smaller than `a`. (Please note, the C++20 three-way comparison operator `<=>`, also known as the spaceship operator, gives the result the other way round: `a` compared to `b`.)

The result of this code looks like this:

    compareInt() Comparisons:
    compareInt(int32{0}, int32{0}) => Equal
    compareInt(int32{-2147483648}, uint32{2147483648}) => Greater
    compareInt(int32{-1}, uint32{4294967295}) => Greater
    compareInt(int32{-1}, uint32{0}) => Greater
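If a yes/no answer is all you need, a smaller helper may be enough. The following is only a sketch under the same C++17 constraint; `safeLess` is a hypothetical name and, unlike `compareInt`, it answers in the conventional direction, “is `a` less than `b`?”. C++20 provides `std::cmp_less` in `<utility>` with the same semantics.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: true if the mathematical value of a is less than b,
// regardless of the signedness of the two integer types.
template<typename A, typename B>
constexpr auto safeLess(A a, B b) noexcept -> bool {
    static_assert(std::is_integral_v<A> && std::is_integral_v<B>);
    if constexpr (std::is_signed_v<A> == std::is_signed_v<B>) {
        return a < b; // same signedness: the built-in comparison is correct
    } else if constexpr (std::is_signed_v<A>) {
        // a is signed, b is unsigned: a negative a is always less.
        return a < 0 || static_cast<std::make_unsigned_t<A>>(a) < b;
    } else {
        // a is unsigned, b is signed: a negative b is never greater.
        return b >= 0 && a < static_cast<std::make_unsigned_t<B>>(b);
    }
}

// The cases that surprised us with the built-in operators:
static_assert(safeLess(-1, 0u));
static_assert(safeLess(-1, 0xffffffffu));
static_assert(safeLess(std::numeric_limits<std::int32_t>::min(), 0x80000000u));
static_assert(!safeLess(0u, -1));
```

The casts to the unsigned type only happen after the sign has been checked, so they never change the value.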

## Conclusion

I hope you found this journey into the abysses of the C++ language interesting. I’m aware this post mostly points out the traps you have to avoid and offers few solutions for navigating around them. Yet, I am afraid adding various solutions for the individual topics would have added too much content to this overview.

If you have questions, missed any information, or wish to provide feedback, simply add a comment below or send me a message.