Cool C++ Features and Weird Details
Some useful and cursed C++ features
C++ has a lot of small features that are either useful, cursed, or both.
This post is just a collection of things I want to remember.
Array Indexing is Symmetric
int arr[5] = {1, 2, 3, 4, 5};
printf("%d\n", arr[3]); // 4
printf("%d\n", 3[arr]); // 4
This works because:
arr[3]
is defined as:
*(arr + 3)
And:
3[arr]
is defined as:
*(3 + arr)
Pointer addition is commutative here, so both access the same element.
This is valid C++, but please do not write 3[arr] unless you are trying to summon demons.
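To check the equivalence rather than just print it, here is a minimal sketch that compares all three spellings at compile time. The helper name `indexing_is_symmetric` is made up for illustration.

```cpp
// Sketch: arr[i], i[arr], and *(arr + i) all name the same element.
// `indexing_is_symmetric` is a hypothetical helper, not a library function.
constexpr bool indexing_is_symmetric() {
    int arr[5] = {1, 2, 3, 4, 5};
    for (int i = 0; i < 5; ++i) {
        // The three spellings must yield the same value...
        if (arr[i] != i[arr] || arr[i] != *(arr + i)) return false;
        // ...and the same address.
        if (&arr[i] != &i[arr]) return false;
    }
    return true;
}

// Evaluated by the compiler, so a mismatch would be a build error.
static_assert(indexing_is_symmetric(), "arr[i] and i[arr] must agree");
```

This only works for built-in arrays and pointers. A class with an overloaded operator[] (std::vector, for example) does not get the swapped form.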
Struct Alignment and Padding
Consider this struct:
struct A {
    char a; // 1 byte
    int b;  // 4 bytes, wants 4-byte alignment
    char c; // 1 byte
};
printf("%zu\n", sizeof(A)); // usually 12
The size is not just:
1 + 4 + 1 = 6
because members need to satisfy alignment requirements.
A typical layout is:
| Field | a | padding | b | c | padding |
|---|---|---|---|---|---|
| Size | 1 byte | 3 bytes | 4 bytes | 1 byte | 3 bytes |
So the total becomes 12 bytes.
The compiler also pads the end of the struct so that arrays work correctly:
A arr[10];
Each A object must still have proper alignment.
Reordering Members
We can reduce padding by grouping smaller members together:
struct B {
    char a; // 1 byte
    char c; // 1 byte
    int b;  // 4 bytes
};
printf("%zu\n", sizeof(B)); // usually 8
Typical layout:
| Field | a | c | padding | b |
|---|---|---|---|---|
| Size | 1 byte | 1 byte | 2 bytes | 4 bytes |
So B is smaller than A.
Struct member order can affect memory usage. This matters more when you store millions of objects.
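The exact sizes depend on the platform, so instead of hard-coding 12 and 8, the sketch below asserts only what every conforming implementation guarantees and computes the padding from sizeof. The names `padding_in_A` and `padding_in_B` are illustrative.

```cpp
#include <cstddef>  // offsetof, std::size_t

struct A { char a; int b; char c; };  // same member order as A above
struct B { char a; char c; int b; };  // same member order as B above

// These hold on any conforming implementation:
static_assert(offsetof(A, b) % alignof(int) == 0, "b sits on an int boundary");
static_assert(sizeof(A) % alignof(A) == 0, "size is a multiple of alignment");
static_assert(sizeof(B) % alignof(B) == 0, "size is a multiple of alignment");

// Bytes in each struct that belong to no member.
// On a typical ABI with 4-byte int these come out as 6 and 2.
constexpr std::size_t padding_in_A =
    sizeof(A) - (sizeof(char) + sizeof(int) + sizeof(char));
constexpr std::size_t padding_in_B =
    sizeof(B) - (sizeof(char) + sizeof(char) + sizeof(int));
```

A common rule of thumb is to order members from largest alignment to smallest. That usually minimizes padding, although the standard does not promise any particular layout.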
Compiler Optimizations
int x = 5;
int y = x * 2;
The compiler may optimize this to:
int y = 10;
because x * 2 can be known at compile time.
This kind of optimization is called constant folding.
Of course, real compilers do way more than this:
- remove unused code
- inline functions
- simplify expressions
- unroll loops
- vectorize loops
The important idea: C++ source code is not a literal list of CPU instructions. The optimizer is allowed to transform your code as long as the observable behavior stays the same.
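As a sketch of "same observable behavior, different instructions": the two functions below return the same value for every input, so an optimizer is allowed to emit something like the second when you wrote the first. Whether a given compiler actually does this depends on the compiler and flags. Both function names are made up for this example.

```cpp
// What the source says: add the numbers 1..n one at a time.
constexpr unsigned long long sum_loop(unsigned n) {
    unsigned long long total = 0;
    for (unsigned long long i = 1; i <= n; ++i) total += i;
    return total;
}

// What an optimizer may legally produce instead: no loop at all.
// The widening cast keeps the product from wrapping, assuming unsigned long long
// is wider than unsigned (true on mainstream platforms).
constexpr unsigned long long sum_closed_form(unsigned n) {
    return static_cast<unsigned long long>(n) * (n + 1ULL) / 2;
}

// A caller cannot tell the two apart from the results.
static_assert(sum_loop(100) == sum_closed_form(100), "same observable result");
```

The flip side is that anything not covered by "observable behavior" (timing, the exact instructions, the value of a variable nobody reads) is fair game for the optimizer.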
Unsafe C Library Functions
Some old C functions are very unsafe if used carelessly.
strcpy
char buffer[10];
strcpy(buffer, "This is a long string that exceeds the buffer size!");
strcpy does not check whether the destination buffer is large enough.
If the source string is too long, it writes past the end of the array.
That is a buffer overflow.
Prefer C++ types when possible:
std::string s = "This is safe";
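If a raw char buffer is unavoidable, the copy needs an explicit size check. A minimal sketch, assuming a hypothetical helper `copy_if_fits` that refuses to copy instead of overflowing:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dst only if it fits, terminator included.
// Returns false and leaves dst as an empty string when src is too long.
inline bool copy_if_fits(char* dst, std::size_t dst_size, const char* src) {
    assert(dst != nullptr && src != nullptr);
    if (dst_size == 0) return false;
    const std::size_t len = std::strlen(src);
    if (len + 1 > dst_size) {  // +1 for the trailing '\0'
        dst[0] = '\0';
        return false;
    }
    std::memcpy(dst, src, len + 1);  // copies the terminator too
    return true;
}
```

With the 10-byte buffer from above, the long string is rejected and the buffer stays a valid empty string instead of being overrun.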
atoi
char str[] = "xyz";
int num = atoi(str);
atoi has no real error handling. If the input is invalid, it just returns 0, which is indistinguishable from parsing a valid "0". If the value does not fit in an int, the behavior is undefined.
Better alternatives:
std::stoi("123");
or, for low-level parsing:
std::from_chars(...);
Old C APIs are powerful, but many of them trust the programmer way too much.
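Here is one way the std::from_chars call could be filled in. It is a sketch, not the only reasonable design: the wrapper name `parse_int` and the choice to reject trailing characters are my assumptions.

```cpp
#include <charconv>      // std::from_chars (C++17)
#include <optional>
#include <string_view>
#include <system_error>  // std::errc

// Hypothetical helper: parses the whole string as a base-10 int.
// Returns std::nullopt for empty input, trailing junk, or out-of-range values.
inline std::optional<int> parse_int(std::string_view text) {
    int value = 0;
    const char* first = text.data();
    const char* last = text.data() + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    // ec reports invalid or out-of-range input; ptr != last means leftovers.
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}
```

Unlike atoi, a failed parse is now distinguishable from a valid "0". Note that std::from_chars does not skip leading whitespace and does not accept a leading '+', so this wrapper rejects those inputs too.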
Digraphs and Trigraphs
C++ has alternative spellings for some symbols.
| Symbol | Digraph | Trigraph |
|---|---|---|
| { | <% | ??< |
| } | %> | ??> |
| [ | <: | ??( |
| ] | :> | ??) |
| # | %: | ??= |
Digraph example:
%:include <iostream>
int main() <%
    int a<:3:> = {1, 2, 3};
    return 0;
%>
This is equivalent to:
#include <iostream>
int main() {
    int a[3] = {1, 2, 3};
    return 0;
}
Trigraphs existed for old systems where some characters were hard to type.
Digraphs still exist. Trigraphs were removed in C++17. Either way, do not use them unless you enjoy cursed archaeology.
main is Not the Real Start
We usually think the program starts here:
int main(int argc, char** argv) {
    // ...
}
But the operating system does not directly “start C++” from main.
A lower-level entry point, often called _start, runs first.
Conceptually:
void _start() {
    setup_runtime();
    int result = main(argc, argv);
    exit(result);
}
Before main, the runtime may:
- set up stack/environment
- initialize global/static objects
- initialize libc / C++ runtime
- prepare argc and argv
After main, it also:
- destroys static objects
- flushes streams
- exits the process
main is the C++ entry point. _start is closer to the real OS-level entry point.
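You can observe the "work before main" part from inside C++ without touching _start. In this sketch a global object with a constructor records an event, and in practice that record already exists by the time main begins. `startup_log`, `Announcer`, and `g_announcer` are names invented for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical event log. A function-local static is constructed on first use,
// which sidesteps initialization-order problems between globals.
inline std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

struct Announcer {
    Announcer() { startup_log().push_back("global constructed"); }
};

// Namespace-scope object with a non-trivial constructor: the runtime runs this
// constructor as part of start-up, before the first statement of main on
// mainstream implementations.
Announcer g_announcer;
```

If main starts by checking startup_log(), it already finds one entry, written by code that ran during start-up.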
A Byte is Not Always 8 Bits
In C++:
sizeof(char) == 1
is always true.
But this means:
sizeof(char) == 1 byte
not necessarily:
1 byte == 8 bits
The number of bits in a byte is given by:
#include <climits>
CHAR_BIT
On almost all modern machines:
CHAR_BIT == 8
But the C++ standard does not require this. It only requires CHAR_BIT >= 8.
In normal competitive programming and desktop programming, assuming 8-bit bytes is fine. But technically, C++ only guarantees sizeof(char) == 1 and CHAR_BIT >= 8.
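When code needs a width in bits, multiplying by CHAR_BIT instead of a literal 8 keeps it honest. A small sketch, with `storage_bits` as a made-up helper name:

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t

static_assert(sizeof(char) == 1, "true by definition");
static_assert(CHAR_BIT >= 8, "the standard guarantees at least 8 bits per byte");

// Hypothetical helper: size of T's storage in bits, without assuming 8-bit bytes.
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}
```

If a codebase truly depends on 8-bit bytes, a single static_assert(CHAR_BIT == 8) near the top documents that assumption and fails the build on exotic targets.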
Integer Literal Prefixes
auto binary = 0b1010; // binary, 10
auto octal = 012; // octal, 10
auto hex = 0xA; // hexadecimal, 10
Integer literal prefixes:
| Prefix | Base | Example |
|---|---|---|
| 0b / 0B | 2 | 0b1010 |
| leading 0 | 8 | 012 |
| 0x / 0X | 16 | 0xA |
The octal one is the most dangerous.
int x = 010; // 8, not 10
Leading zero means octal. This is one of the most annoying C/C++ legacy traps.
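The trap usually shows up when someone zero-pads numbers to line up columns. A small sketch (the array name `padded` is illustrative), with static_assert pinning down what the compiler actually sees:

```cpp
// Zero-padded for alignment, which silently turns the last two entries octal.
constexpr int padded[] = {
    100,
    020,  // octal 20 == decimal 16, not 20
    003,  // octal 3 == decimal 3, unchanged only by luck
};

static_assert(padded[1] == 16, "020 is octal");
static_assert(0b1010 == 10 && 012 == 10 && 0xA == 10, "three spellings of ten");
```

Padding with 08 or 09 at least fails to compile, because 8 and 9 are not octal digits. The quiet cases like 020 are the dangerous ones.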
Recursive Lambda with Deducing this (C++23)
Before C++23, recursive lambdas often needed tricks like y_combinator or passing self manually.
In C++23, we can write:
auto dfs = [&](this auto&& self, int u) -> void {
    for (int v : graph[u]) {
        self(v);
    }
};
dfs(0);
Here, self refers to the lambda itself.
This makes recursive lambdas much cleaner.
Older style:
auto dfs = [&](auto&& self, int u) -> void {
    for (int v : graph[u]) {
        self(self, v);
    }
};
dfs(dfs, 0);
The C++23 version removes the annoying extra self argument in self(self, ...).
This is very nice for DFS-style code, but online judges may not support C++23 yet.
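For a complete example that also builds on judges without C++23, here is a sketch that uses the older self-passing style. The function name `dfs_preorder` and the visited array are my additions. The snippets above omit visited tracking, which is only safe on trees or DAGs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the vertices reachable from root in DFS preorder.
// Written in the pre-C++23 style so it compiles as C++14 and later.
inline std::vector<int> dfs_preorder(const std::vector<std::vector<int>>& graph,
                                     int root) {
    assert(root >= 0 && static_cast<std::size_t>(root) < graph.size());
    std::vector<int> order;
    std::vector<bool> visited(graph.size(), false);

    auto dfs = [&](auto&& self, int u) -> void {
        visited[u] = true;
        order.push_back(u);
        for (int v : graph[u]) {
            if (!visited[v]) self(self, v);  // pass the lambda to itself
        }
    };
    dfs(dfs, root);
    return order;
}
```

On a C++23 compiler the lambda header becomes [&](this auto&& self, int u) and the calls shrink to self(v) and dfs(root). Nothing else changes.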
Three-Way Comparison <=> (C++20)
The spaceship operator can generate comparisons automatically.
#include <compare>
struct Node {
    int x, y, id;
    auto operator<=>(const Node&) const = default;
};
This compares members in declaration order:
x first, then y, then id
With = default, C++ can generate comparison operators for us.
Custom Ordering
#include <compare>
struct Point {
    int x, y;
    std::strong_ordering operator<=>(const Point& other) const {
        if (auto cmp = x <=> other.x; cmp != 0) {
            return cmp;
        }
        return other.y <=> y; // y descending
    }
    bool operator==(const Point& other) const = default;
};
This sorts by:
x ascending
y descending
If you write a custom <=>, also default or define operator==, because only a defaulted <=> generates == for you.
ranges::sort with Projection (C++20)
Normally, to sort by .second:
sort(a.begin(), a.end(), [](auto const& x, auto const& y) {
    return x.second < y.second;
});
With ranges projection:
ranges::sort(a, {}, [](auto const& p) {
    return p.second;
});
The middle {} means “use the default comparator”.
So this means:
sort by projected key p.second
For structs, member pointer projection is even cleaner:
struct Edge {
    int u, v, w;
};
vector<Edge> e;
ranges::sort(e, {}, &Edge::w);
This sorts edges by weight.
Use auto const& p in the projection if the element is large. auto p copies the element.
if / switch Initializer (C++17)
C++17 lets us declare a variable inside an if condition:
if (auto it = mp.find(x); it != mp.end()) {
    cout << it->second << '\n';
}
The variable it only exists inside the if / else statement.
This avoids leaking temporary variables into the outer scope.
Equivalent older style:
auto it = mp.find(x);
if (it != mp.end()) {
    cout << it->second << '\n';
}
Use case:
if (auto [it, ok] = st.insert(x); ok) {
    // inserted successfully
}
This is useful when the variable is only needed for the condition.
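A self-contained sketch of the same patterns, plus the switch form that the heading mentions. The three function names are invented for the example. It needs C++17.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Lookup with a fallback: `it` is visible in both branches, nowhere else.
inline std::string name_or(const std::map<int, std::string>& names, int id,
                           const std::string& fallback) {
    if (auto it = names.find(id); it != names.end()) {
        return it->second;
    } else {
        return fallback;  // `it` is still in scope here (and equals end())
    }
}

// Counts distinct values using the insert-and-check pattern.
inline int count_distinct(const std::vector<int>& values) {
    std::set<int> seen;
    int distinct = 0;
    for (int v : values) {
        if (auto [it, inserted] = seen.insert(v); inserted) ++distinct;
    }
    return distinct;
}

// The same initializer syntax works in a switch.
inline const char* describe_size(const std::vector<int>& values) {
    switch (const std::size_t n = values.size(); n) {
        case 0:  return "empty";
        case 1:  return "single";
        default: return "many";
    }
}
```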
<bit> Utilities (C++20)
C++20 added useful bit functions in <bit>.
#include <bit>
std::popcount(x);
std::countl_zero(x);
std::countr_zero(x);
Common ones:
| Function | Meaning |
|---|---|
| std::popcount(x) | number of set bits |
| std::countl_zero(x) | leading zero bits |
| std::countr_zero(x) | trailing zero bits |
| std::has_single_bit(x) | whether x is a power of two |
| std::bit_width(x) | number of bits needed to represent x |
Example:
unsigned x = 12; // 1100
std::popcount(x); // 2
std::countr_zero(x); // 2
These functions are safer than compiler builtins like:
__builtin_clz(x);
__builtin_ctz(x);
because the standard functions are well-defined for 0.
std::countr_zero(0u); // OK
But for builtins like __builtin_ctz(0), the result is undefined.
These functions only accept unsigned integer types, so prefer unsigned values when doing bit tricks.
Summary
Useful features:
- if (init; condition) keeps temporary variables scoped.
- <bit> gives safe standard bit operations.
- <=> reduces comparison boilerplate.
- ranges::sort projections make sorting by key cleaner.
- C++23 deducing this makes recursive lambdas nicer.
Cursed but useful details:
- arr[i] and i[arr] are equivalent.
- struct padding can change object size.
- main is not the true low-level entry point.
- sizeof(char) == 1 does not mean one byte is always 8 bits.
- leading 0 means octal.