|
| 1 | +# C++26 Reflection for JSON Serialization: A Practical Journey |
| 2 | + |
| 3 | +## The Problem: Every Developer's JSON Nightmare |
| 4 | + |
| 5 | +Imagine you're building a game server that needs to persist player data. You start simple: |
| 6 | + |
| 7 | +```cpp |
| 8 | +struct Player { |
| 9 | + std::string username; |
| 10 | + int level; |
| 11 | + double health; |
| 12 | + std::vector<std::string> inventory; |
| 13 | +}; |
| 14 | +``` |
| 15 | +
|
| 16 | +### The Traditional Approach: Manual Serialization |
| 17 | +
|
| 18 | +Without reflection, you write this tedious code: |
| 19 | +
|
| 20 | +```cpp |
| 21 | +// Serialization - converting Player to JSON |
| 22 | +std::string serialize_player(const Player& p) { |
| 23 | + std::stringstream ss; |
| 24 | + ss << "{"; |
| 25 | + ss << "\"username\":\"" << escape_json(p.username) << "\","; |
| 26 | + ss << "\"level\":" << p.level << ","; |
| 27 | + ss << "\"health\":" << p.health << ","; |
| 28 | + ss << "\"inventory\":["; |
| 29 | + for (size_t i = 0; i < p.inventory.size(); ++i) { |
| 30 | + if (i > 0) ss << ","; |
| 31 | + ss << "\"" << escape_json(p.inventory[i]) << "\""; |
| 32 | + } |
| 33 | + ss << "]"; |
| 34 | + ss << "}"; |
| 35 | + return ss.str(); |
| 36 | +} |
| 37 | +
|
| 38 | +// Deserialization - converting JSON back to Player |
| 39 | +simdjson::error_code deserialize_player(simdjson::ondemand::value& val, Player& p) { |
| 40 | + simdjson::ondemand::object obj; |
| 41 | + SIMDJSON_TRY(val.get_object().get(obj)); |
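| 42 | + std::string_view username; int64_t level = 0; // simdjson yields string_view and int64_t, so read into temporaries |
| 43 | + SIMDJSON_TRY(obj["username"].get_string().get(username)); |
| 44 | + SIMDJSON_TRY(obj["level"].get_int64().get(level)); |
| 45 | + SIMDJSON_TRY(obj["health"].get_double().get(p.health)); |
| 46 | + p.username = username; p.level = static_cast<int>(level); // copy out and narrow to the struct's types |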
| 47 | + simdjson::ondemand::array arr; |
| 48 | + SIMDJSON_TRY(obj["inventory"].get_array().get(arr)); |
| 49 | + for (auto item : arr) { |
| 50 | + std::string_view sv; |
| 51 | + SIMDJSON_TRY(item.get_string().get(sv)); |
| 52 | + p.inventory.emplace_back(sv); |
| 53 | + } |
| 54 | +
|
| 55 | + return simdjson::SUCCESS; |
| 56 | +} |
| 57 | +``` |
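The serializer above calls `escape_json`, which is never shown. A minimal sketch of what such a helper might look like follows, assuming UTF-8 input and only the escapes that JSON requires (quote, backslash, and control characters below 0x20). The name and signature are taken from the call sites above; the body is an illustration, not simdjson code.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper: escapes a UTF-8 string so it can be placed between
// double quotes in a JSON document.
std::string escape_json(std::string_view in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\b': out += "\\b";  break;
            case '\f': out += "\\f";  break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters use the \u00XX form.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}
```

Even this small helper is one more piece of code to write, test, and keep correct by hand.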
| 58 | + |
| 59 | +### The Pain Points |
| 60 | + |
| 61 | +This manual approach has several problems: |
| 62 | + |
| 63 | +1. **Repetition**: Every field needs to be handled twice (serialize + deserialize) |
| 64 | +2. **Maintenance Nightmare**: Add a new field? Update both functions! |
| 65 | +3. **Error-Prone**: Typos in field names, forgotten fields, type mismatches |
| 66 | +4. **Boilerplate Explosion**: 30+ lines for a simple 4-field struct |
| 67 | + |
| 68 | +Now imagine your game grows: |
| 69 | + |
| 70 | +```cpp |
| 71 | +struct Equipment { |
| 72 | + std::string name; |
| 73 | + int damage; |
| 74 | + int durability; |
| 75 | +}; |
| 76 | + |
| 77 | +struct Achievement { |
| 78 | + std::string title; |
| 79 | + std::string description; |
| 80 | + bool unlocked; |
| 81 | + std::chrono::system_clock::time_point unlock_time; |
| 82 | +}; |
| 83 | + |
| 84 | +struct Player { |
| 85 | + std::string username; |
| 86 | + int level; |
| 87 | + double health; |
| 88 | + std::vector<std::string> inventory; |
| 89 | + std::map<std::string, Equipment> equipped; // New! |
| 90 | + std::vector<Achievement> achievements; // New! |
| 91 | + std::optional<std::string> guild_name; // New! |
| 92 | +}; |
| 93 | +``` |
| 94 | +
|
| 95 | +Suddenly you need to write hundreds of lines of serialization code! 😱 |
| 96 | +
|
| 97 | +## The Solution: C++26 Static Reflection |
| 98 | +
|
| 99 | +With C++26 reflection and simdjson, the same functionality becomes: |
| 100 | +
|
| 101 | +```cpp |
| 102 | +// That's it. Nothing else needed! |
| 103 | +struct Player { |
| 104 | + std::string username; |
| 105 | + int level; |
| 106 | + double health; |
| 107 | + std::vector<std::string> inventory; |
| 108 | + std::map<std::string, Equipment> equipped; |
| 109 | + std::vector<Achievement> achievements; |
| 110 | + std::optional<std::string> guild_name; |
| 111 | +}; |
| 112 | +
|
| 113 | +// Serialization - automatic! |
| 114 | +void save_player(const Player& p) { |
| 115 | + std::string json = simdjson::to_json(p); // That's it! |
| 116 | + // Save json to file... |
| 117 | +} |
| 118 | +
|
| 119 | +// Deserialization - automatic! |
| 120 | +Player load_player(const std::string& json_str) { |
| 121 | + return simdjson::from<Player>(json_str); // That's it! |
| 122 | +} |
| 123 | +``` |
| 124 | + |
| 125 | +Compare this to other modern languages: |
| 126 | + |
| 127 | +```python |
| 128 | +# Python |
| 129 | +import json |
| 130 | +json_str = json.dumps(player.__dict__) |
| 131 | +player = Player(**json.loads(json_str)) |
| 132 | +``` |
| 133 | + |
| 134 | +```rust |
| 135 | +// Rust with serde |
| 136 | +let json_str = serde_json::to_string(&player)?; |
| 137 | +let player: Player = serde_json::from_str(&json_str)?; |
| 138 | +``` |
| 139 | + |
| 140 | +```cpp |
| 141 | +// C++26 with simdjson - just as clean! |
| 142 | +std::string json_str = simdjson::to_json(player); |
| 143 | +Player player = simdjson::from<Player>(json_str); |
| 144 | +``` |
| 145 | + |
| 146 | +## How Does It Work? |
| 147 | + |
| 148 | +### The Key Insight: Compile-Time Code Generation |
| 149 | + |
| 150 | +A common question: "How can compile-time reflection handle runtime JSON data?" The answer is beautiful: reflection operates on **types and structure**, not runtime values. It generates regular C++ code at compile time that handles your runtime data. |
| 151 | + |
| 152 | +```cpp |
| 153 | +// What you write: |
| 154 | +Player p = simdjson::from<Player>(runtime_json_string); |
| 155 | + |
| 156 | +// What reflection generates at COMPILE TIME (conceptually): |
| 157 | +Player deserialize_Player(const json& j) { |
| 158 | + Player p; |
| 159 | + p.username = j["username"].get<std::string>(); |
| 160 | + p.level = j["level"].get<int>(); |
| 161 | + p.health = j["health"].get<double>(); |
| 162 | + p.inventory = j["inventory"].get<std::vector<std::string>>(); |
| 163 | + // ... etc for all members |
| 164 | + return p; |
| 165 | +} |
| 166 | +``` |
| 167 | +
|
| 168 | +### The Actual Reflection Magic |
| 169 | +
|
| 170 | +Here's a simplified view of what happens behind the scenes: |
| 171 | +
|
| 172 | +```cpp |
| 173 | +template <typename T> |
| 174 | + requires(std::is_class_v<T>) // For user-defined types |
| 175 | +error_code deserialize(auto& json_value, T& out) { |
| 176 | + simdjson::ondemand::object obj; |
| 177 | + SIMDJSON_TRY(json_value.get_object().get(obj)); |
| 178 | +
|
| 179 | + // `template for` is a C++26 expansion statement: it is expanded at COMPILE TIME, |
| 180 | + // instantiating the body once per data member, so `return err` exits deserialize() |
| 181 | + template for (constexpr auto member : std::define_static_array(std::meta::nonstatic_data_members_of(^^T, std::meta::access_context::current()))) { |
| 182 | + // These are compile-time constants |
| 183 | + constexpr std::string_view field_name = std::meta::identifier_of(member); |
| 184 | + constexpr auto member_type = std::meta::type_of(member); |
| 185 | +
|
| 186 | + // This generates: out.username = obj["username"].get<std::string>() |
| 187 | + // or: out.level = obj["level"].get<int>() |
| 188 | + // etc. for each member |
| 189 | + auto err = obj[field_name].get(out.[:member:]); |
| 190 | + if (err && err != simdjson::NO_SUCH_FIELD) { |
| 191 | + return err; |
| 192 | + } |
| 193 | + } |
| 194 | +
|
| 195 | + return simdjson::SUCCESS; |
| 196 | +} |
| 197 | +``` |
| 198 | + |
| 199 | +The `template for` expansion statement is the key: it acts as a compile-time for-loop that stamps out one copy of the body per struct member. By the time your program runs, all reflection has been expanded into ordinary C++ code. |
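The compile-time loop over struct members described above can be approximated without C++26 support, which may help build intuition. The sketch below is a C++17 emulation, not the simdjson implementation: the member list is written by hand (the step reflection would automate), and a fold expression plays the role of the expansion. The names `field`, `player_fields`, `for_each_field`, and `dump` are hypothetical.

```cpp
#include <sstream>
#include <string>
#include <string_view>
#include <tuple>

struct Player {
    std::string username;
    int level;
    double health;
};

// Hypothetical descriptor: pairs a key name with a pointer to the member.
template <typename Class, typename Member>
struct field {
    std::string_view name;
    Member Class::*ptr;
};

// Hand-written stand-in for what reflection would discover on its own.
constexpr auto player_fields = std::make_tuple(
    field<Player, std::string>{"username", &Player::username},
    field<Player, int>{"level", &Player::level},
    field<Player, double>{"health", &Player::health});

// Calls fn(field) once per entry. The fold expression is unrolled by the
// compiler, so no loop over member descriptions remains in the binary.
template <typename Tuple, typename Fn>
void for_each_field(const Tuple& fields, Fn&& fn) {
    std::apply([&](const auto&... f) { (fn(f), ...); }, fields);
}

// Writes "name=value;" for every listed member (no JSON escaping here).
std::string dump(const Player& p) {
    std::ostringstream out;
    for_each_field(player_fields, [&](const auto& f) {
        out << f.name << '=' << p.*(f.ptr) << ';';
    });
    return out.str();
}
```

The weak point of this emulation is `player_fields`: it must be kept in sync with the struct by hand, which is exactly the maintenance burden reflection removes.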
| 200 | + |
| 201 | +### Compile-Time vs Runtime: What Happens When |
| 202 | + |
| 203 | +```cpp |
| 204 | +struct Player { |
| 205 | + std::string username; // ← Compile-time: reflection sees this |
| 206 | + int level; // ← Compile-time: reflection sees this |
| 207 | + double health; // ← Compile-time: reflection sees this |
| 208 | +}; |
| 209 | + |
| 210 | +// COMPILE TIME: Reflection reads Player's structure and generates: |
| 211 | +// - Code to read "username" as string |
| 212 | +// - Code to read "level" as int |
| 213 | +// - Code to read "health" as double |
| 214 | + |
| 215 | +// RUNTIME: The generated code processes actual JSON data |
| 216 | +std::string json = R"({"username":"Alice","level":42,"health":100.0})"; |
| 217 | +Player p = simdjson::from<Player>(json); // Runtime values flow through compile-time generated code |
| 218 | +``` |
| 219 | +
|
| 220 | +### Compile-Time Safety: Catching Errors Before They Run |
| 221 | +
|
| 222 | +The beauty of reflection is that many errors are caught at compile time: |
| 223 | +
|
| 224 | +```cpp |
| 225 | +// ❌ RUNTIME ERROR (not a compile error): the JSON text is only known at run time |
| 226 | +struct BadPlayer { |
| 227 | + int username; // Oops, should be string! |
| 228 | +}; |
| 229 | +// simdjson::from<BadPlayer>(json) compiles; parsing then reports a type error if the JSON has a string for username |
| 230 | +
|
| 231 | +// ❌ COMPILE ERROR: Non-serializable type |
| 232 | +struct InvalidType { |
| 233 | + std::thread t; // Threads can't be serialized! |
| 234 | +}; |
| 235 | +// simdjson::to_json(InvalidType{}) fails at compile time |
| 236 | +
|
| 237 | +// ❌ COMPILE ERROR: Private members (without friend access) |
| 238 | +class Secretive { |
| 239 | + int hidden_value; // Private! |
| 240 | +}; |
| 241 | +// Reflection can't access private members without proper access |
| 242 | +
|
| 243 | +// ✅ COMPILE SUCCESS: All types are serializable |
| 244 | +struct GoodType { |
| 245 | + std::vector<int> numbers; |
| 246 | + std::map<std::string, double> scores; |
| 247 | + std::optional<std::string> nickname; |
| 248 | +}; |
| 249 | +// All standard containers work automatically! |
| 250 | +``` |
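One way a library could reject non-serializable member types at compile time is a type trait checked with `static_assert`. The sketch below illustrates the idea only and is not how simdjson does it. It uses plain C++17 traits (a C++20 concept would read similarly), covers only a few leaf and container types, and leaves out user-defined structs, which would need reflection to recurse into their members. The names `is_serializable` and `require_serializable` are hypothetical.

```cpp
#include <map>
#include <optional>
#include <string>
#include <thread>
#include <type_traits>
#include <vector>

// Hypothetical trait: true only for types this sketch knows how to serialize.
template <typename T, typename = void>
struct is_serializable : std::false_type {};

// Leaf cases: numbers (including bool) and strings.
template <typename T>
struct is_serializable<T, std::enable_if_t<std::is_arithmetic_v<T>>> : std::true_type {};
template <>
struct is_serializable<std::string> : std::true_type {};

// Containers are serializable exactly when their element type is.
template <typename T>
struct is_serializable<std::vector<T>> : is_serializable<T> {};
template <typename T>
struct is_serializable<std::optional<T>> : is_serializable<T> {};
template <typename V>
struct is_serializable<std::map<std::string, V>> : is_serializable<V> {};

template <typename T>
constexpr bool is_serializable_v = is_serializable<T>::value;

// A to_json-style entry point could begin with this check, so unsupported
// types fail to compile with a readable message.
template <typename T>
void require_serializable() {
    static_assert(is_serializable_v<T>, "type cannot be serialized to JSON");
}
```

With this in place, `require_serializable<std::vector<int>>()` compiles, while `require_serializable<std::thread>()` stops the build at the `static_assert`.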
| 251 | + |
| 252 | +### Zero Overhead: Why It's Fast |
| 253 | + |
| 254 | +Because reflection runs entirely at compile time, it adds no runtime cost of its own: |
| 255 | + |
| 256 | +1. **No runtime type inspection** - everything is known at compile time |
| 257 | +2. **No runtime lookup of member names** - field names are compile-time constants; only the keys in the incoming JSON are compared against them |
| 258 | +3. **Optimal code generation** - the compiler sees the full picture |
| 259 | +4. **Inlining-friendly** - the generated code is ordinary C++ that the optimizer can inline |
| 260 | + |
| 261 | +The generated code can match, and sometimes beat, hand-written code because: |
| 262 | +- Every member goes through the same optimized path |
| 263 | +- There are no hand-introduced inefficiencies such as redundant lookups or copies |
| 264 | +- It leverages simdjson's SIMD parsing throughout |
| 265 | + |
| 266 | +## Performance: The Best Part |
| 267 | + |
| 268 | +You might think "automatic = slow", but with simdjson + reflection: |
| 269 | + |
| 270 | +- **Compile-time code generation**: No runtime overhead from reflection |
| 271 | +- **SIMD-accelerated parsing**: simdjson uses CPU vector instructions |
| 272 | +- **Few allocations**: the parser works with string views and in-place parsing; memory is allocated only for the `std::string` and container members being filled |
| 273 | +- **Throughput**: ~2-4 GB/s on modern hardware, depending on the CPU and the shape of the JSON |
| 274 | + |
| 275 | +As noted above, the generated code can keep up with hand-written code: every field goes through the same optimized simdjson parsing path, with no forgotten fields or inefficient loops. |
| 279 | + |
| 280 | +## Real-World Benefits |
| 281 | + |
| 282 | +### Before Reflection (Our Game Server example) |
| 283 | +- 1000+ lines of serialization code |
| 284 | +- Prone to bugs when the serializer and deserializer drift out of sync |
| 285 | +- Adding new features means tedious changes to boilerplate serialization code |
| 286 | + |
| 287 | +### After Reflection |
| 288 | +- 0 lines of serialization code |
| 289 | +- No hand-written field-mapping bugs: the serializer and deserializer are both generated from the same struct definition |
| 290 | +- New features can be added much faster |
| 291 | + |
| 292 | +## The Bigger Picture |
| 293 | + |
| 294 | +This pattern extends beyond games: |
| 295 | + |
| 296 | +- **REST APIs**: Automatic request/response serialization |
| 297 | +- **Configuration Files**: Type-safe config loading |
| 298 | +- **Message Queues**: Serialize/deserialize messages |
| 299 | +- **Databases**: Object-relational mapping |
| 300 | +- **RPC Systems**: Automatic protocol generation |
| 301 | + |
| 302 | +With C++26 reflection, C++ finally catches up to languages like Rust (serde), Go (encoding/json), and C# (System.Text.Json) in terms of ease of use, but with better performance thanks to simdjson's SIMD optimizations. |
| 303 | + |
| 304 | +## Try It Yourself |
| 305 | + |
| 306 | +This section is a good hook to talk about our work on using concepts to support containers and `std::optional`. (TODO: We probably want to leave tag_invoke out of the presentation, as it is far from a trivial concept.) |
| 307 | + |
| 308 | +```cpp |
| 309 | +struct Meeting { |
| 310 | + std::string title; |
| 311 | + std::chrono::system_clock::time_point start_time; |
| 312 | + std::vector<std::string> attendees; |
| 313 | + std::optional<std::string> location; |
| 314 | + bool is_recurring; |
| 315 | +}; |
| 316 | + |
| 317 | +// Automatically serializable/deserializable! |
| 318 | +std::string json = simdjson::to_json(Meeting{ |
| 319 | + .title = "CppCon Planning", |
| 320 | + .start_time = std::chrono::system_clock::now(), |
| 321 | + .attendees = {"Alice", "Bob", "Charlie"}, |
| 322 | + .location = "Denver", |
| 323 | + .is_recurring = true |
| 324 | +}); |
| 325 | + |
| 326 | +Meeting m = simdjson::from<Meeting>(json); |
| 327 | +``` |
| 328 | +
|
| 329 | +Or even simpler - round-trip any data structure: |
| 330 | +
|
| 331 | +```cpp |
| 332 | +struct TodoItem { |
| 333 | + std::string task; |
| 334 | + bool completed; |
| 335 | + std::optional<std::string> due_date; |
| 336 | +}; |
| 337 | +
|
| 338 | +struct TodoList { |
| 339 | + std::string owner; |
| 340 | + std::vector<TodoItem> items; |
| 341 | + std::map<std::string, int> tags; // tag -> count |
| 342 | +}; |
| 343 | +
|
| 344 | +// Serialize complex nested structures |
| 345 | +TodoList my_todos = { /* ... */ }; |
| 346 | +std::string json = simdjson::to_json(my_todos); |
| 347 | +
|
| 348 | +// Deserialize back - perfect round-trip |
| 349 | +TodoList restored = simdjson::from<TodoList>(json); |
| 350 | +assert(my_todos == restored); // Works if you define operator== |
| 351 | +``` |
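The round-trip assertion above needs `operator==` for both structs. A minimal sketch of one way to provide it follows, using `std::tie` so it also compiles before C++20 (with C++20 or later, `bool operator==(const TodoItem&) const = default;` inside each struct would do the same job). These comparison operators are an illustration and are not something simdjson generates.

```cpp
#include <map>
#include <optional>
#include <string>
#include <tuple>
#include <vector>

struct TodoItem {
    std::string task;
    bool completed;
    std::optional<std::string> due_date;
};

// Member-wise equality: std::tie builds tuples of references, and tuple
// comparison applies == to each member in turn.
inline bool operator==(const TodoItem& a, const TodoItem& b) {
    return std::tie(a.task, a.completed, a.due_date) ==
           std::tie(b.task, b.completed, b.due_date);
}

struct TodoList {
    std::string owner;
    std::vector<TodoItem> items;
    std::map<std::string, int> tags;  // tag -> count
};

// Comparing the vector relies on the TodoItem operator== defined above.
inline bool operator==(const TodoList& a, const TodoList& b) {
    return std::tie(a.owner, a.items, a.tags) ==
           std::tie(b.owner, b.items, b.tags);
}
```

A comparison like this makes a handy regression test: serialize, deserialize, and check that nothing was lost on the way.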
| 352 | + |
| 353 | +No macros. No code generation. No external tools. Just simdjson leveraging C++26 reflection. |
| 354 | + |
| 355 | +**The entire API surface:** |
| 356 | +- `simdjson::to_json(object)` → JSON string |
| 357 | +- `simdjson::from<T>(json)` → T object |
| 358 | + |
| 359 | +That's it. Two functions. Infinite possibilities. |
| 360 | + |
| 361 | +Welcome to the future of C++ serialization! 🚀 |