Skip to content

Commit 2baf40a

Browse files
authored
Merge branch 'master' into refactor_datacache_session_switchvar
2 parents eb427cc + 5c560d4 commit 2baf40a

1,144 files changed

Lines changed: 1852 additions & 2715 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,67 @@
1+
---
2+
name: be-code-style
3+
description: Fix BE (C++) code formatting issues using clang-format
4+
compatibility: opencode
5+
---
6+
7+
## What I do
8+
9+
Fix C++ code formatting issues in the BE and Cloud modules using the project's clang-format configuration (v16).
10+
11+
## When to use me
12+
13+
- Before committing BE/Cloud C++ code changes
14+
- When CI reports clang-format failures
15+
- When you need to check or fix C++ code style
16+
17+
## Procedure
18+
19+
### Step 1: Auto-fix formatting
20+
21+
Run the project's formatting script, which enforces clang-format v16:
22+
23+
```bash
24+
build-support/clang-format.sh
25+
```
26+
27+
This formats all C++ files under `be/src`, `be/test`, `cloud/src`, `cloud/test` in-place, respecting `.clang-format` and `.clang-format-ignore`.
28+
29+
**Important**: Always use this script instead of invoking `clang-format` directly. The script checks that clang-format version 16 is installed and exits with an error if the wrong version is found. Using a different version will produce inconsistent formatting.
30+
31+
### Step 2: Check-only (no modification)
32+
33+
To verify formatting without modifying files:
34+
35+
```bash
36+
build-support/check-format.sh
37+
```
38+
39+
This outputs a diff of any formatting violations and exits non-zero if there are any.
40+
41+
### Step 3: Review and commit
42+
43+
After running `clang-format.sh`, review the changes with `git diff` to verify only formatting was changed, then stage and commit.
44+
45+
## Key Configuration
46+
47+
| File | Purpose |
48+
|------|---------|
49+
| `.clang-format` | Main formatting rules (Google-based, 100 col, 4-space indent) |
50+
| `.clang-format-ignore` | Files excluded from formatting (third-party, generated) |
51+
| `build-support/run_clang_format.py` | Python wrapper for parallel execution |
52+
53+
## Excluded Directories
54+
55+
The following are excluded from formatting (see `.clang-format-ignore`):
56+
- `be/src/apache-orc/*`, `be/src/clucene/*`, `be/src/gutil/*`
57+
- `be/src/glibc-compatibility/*`
58+
- Specific third-party vendored files (mustache, sse2neon, utf8_check)
59+
- `cloud/src/common/defer.h`
60+
61+
## Troubleshooting
62+
63+
| Problem | Solution |
64+
|---------|----------|
65+
| `clang-format not found` | Install clang-format v16 or set `CLANG_FORMAT_BINARY` env var |
66+
| `version is not 16` | Install clang-format v16; on macOS: `brew install llvm@16` |
67+
| Files not being formatted | Check `.clang-format-ignore` for exclusions |
Lines changed: 138 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,138 @@
1+
---
2+
name: clang-tidy-check
3+
description: Run clang-tidy on newly added/modified BE C++ code
4+
compatibility: opencode
5+
---
6+
7+
## What I do
8+
9+
Run clang-tidy static analysis on newly added or modified C++ files in the BE and Cloud modules, using the project's `.clang-tidy` configuration. The script parses `git diff` to identify changed line ranges and filters clang-tidy output to diagnostics on those lines where possible, reducing noise from pre-existing code. Diagnostics from included headers that were not part of the diff are filtered out, though some edge cases may still appear.
10+
11+
## When to use me
12+
13+
- After building BE, before committing C++ code changes
14+
- When CI reports clang-tidy warnings on your PR
15+
- When you want to proactively check new code for common bugs and style issues
16+
17+
## Prerequisites
18+
19+
1. **The relevant module must be built first** — clang-tidy needs `compile_commands.json` generated during the CMake build. For BE files, build BE; for Cloud files, build Cloud.
20+
2. **clang-tidy must be installed** — the project uses clang-tidy from the LDB toolchain.
21+
22+
## Procedure
23+
24+
### Step 1: Build the relevant module (if not already done)
25+
26+
For **BE** files:
27+
```bash
28+
./build.sh --be -j${DORIS_PARALLELISM}
29+
```
30+
This generates `compile_commands.json` in `be/build_Release/` or `be/build_ASAN/`.
31+
32+
For **Cloud** files:
33+
```bash
34+
./build.sh --cloud -j${DORIS_PARALLELISM}
35+
```
36+
This generates `compile_commands.json` in `cloud/build_Release/` or `cloud/build_ASAN/`.
37+
38+
**Note**: A single compilation database typically covers only one module (BE or Cloud). If your changes span both modules, you may need to run clang-tidy twice with different `--build-dir` values.
39+
40+
### Step 2: Run clang-tidy on changed files
41+
42+
```bash
43+
build-support/run-clang-tidy.sh
44+
```
45+
46+
By default, this script:
47+
- Detects changed C++ files via `git diff` (uncommitted changes + staged changes)
48+
- Automatically finds `compile_commands.json` from the latest BE build
49+
- Runs clang-tidy using the `.clang-tidy` config
50+
- Skips files in excluded directories (third-party, generated code)
51+
- Reports warnings grouped by file
52+
53+
**Options:**
54+
55+
```bash
56+
# Compare against a specific branch (e.g., for PR review)
57+
build-support/run-clang-tidy.sh --base origin/master
58+
59+
# Check specific files (all lines, no line-level filtering)
60+
build-support/run-clang-tidy.sh --files be/src/vec/functions/my_new_func.cpp
61+
62+
# Report ALL warnings in changed files (no line filtering)
63+
build-support/run-clang-tidy.sh --full
64+
65+
# Specify build directory explicitly (required for Cloud if BE build dir was auto-detected)
66+
build-support/run-clang-tidy.sh --build-dir be/build_ASAN
67+
68+
# Check Cloud files with Cloud compilation database
69+
build-support/run-clang-tidy.sh --build-dir cloud/build_ASAN
70+
71+
# Apply auto-fixes where possible (note: fixes may apply beyond changed lines)
72+
build-support/run-clang-tidy.sh --fix
73+
```
74+
75+
**Important**: By default, only warnings on changed lines are reported (diagnostics from headers not in the diff are filtered out). This prevents the agent from "fixing" unrelated pre-existing warnings in large files. Use `--full` to see all warnings if needed.
76+
77+
### Step 3: Interpret and fix warnings
78+
79+
clang-tidy output looks like:
80+
81+
```
82+
be/src/vec/functions/foo.cpp:42:5: warning: use auto when initializing with a cast [modernize-use-auto]
83+
be/src/vec/functions/foo.cpp:55:1: warning: function 'process' exceeds recommended size [readability-function-size]
84+
```
85+
86+
**Fix approach:**
87+
1. Fix all warnings that represent real issues (bugs, performance, readability)
88+
2. For false positives or intentional patterns, add `// NOLINT(check-name)` with a brief justification
89+
3. Report any warnings you cannot reasonably fix
90+
91+
### Step 4: Verify
92+
93+
Re-run the script to confirm all warnings are resolved:
94+
95+
```bash
96+
build-support/run-clang-tidy.sh
97+
```
98+
99+
## Enabled Checks (from `.clang-tidy`)
100+
101+
| Category | Examples |
102+
|----------|----------|
103+
| `clang-diagnostic-*` | Compiler diagnostics |
104+
| `clang-analyzer-*` | Static analysis (null deref, use-after-free, etc.) |
105+
| `bugprone-*` | Use-after-move, redundant branch condition, unused RAII |
106+
| `modernize-*` | Auto, range-for, nullptr (excludes trailing return, nodiscard) |
107+
| `readability-*` | Function size (≤80 lines), cognitive complexity (≤50) |
108+
| `performance-*` | String find, inefficient algorithm, move-const-arg |
109+
| `misc-redundant-expression` | Redundant expressions |
110+
111+
## Excluded Directories
112+
113+
Files in these paths are automatically skipped:
114+
- `be/src/apache-orc/`, `be/src/clucene/`, `be/src/gutil/`
115+
- `be/src/glibc-compatibility/`
116+
- `contrib/` (all third-party code)
117+
- Generated code directories
118+
119+
## NOLINT Usage
120+
121+
When a clang-tidy warning cannot be fixed, suppress it with a comment:
122+
123+
```cpp
124+
// Good: specific check name + reason
125+
int x = (int)y; // NOLINT(modernize-use-auto): explicit cast for clarity
126+
127+
// Bad: blanket suppression without reason
128+
int x = (int)y; // NOLINT
129+
```
130+
131+
## Troubleshooting
132+
133+
| Problem | Solution |
134+
|---------|----------|
135+
| `compile_commands.json not found` | Build BE first: `./build.sh --be -j${DORIS_PARALLELISM}` |
136+
| `clang-tidy not found` | Install via LDB toolchain or `apt install clang-tidy-16` |
137+
| Too many warnings on old code | Use `--base` to diff against a branch, ensuring only your changes are checked |
138+
| Warning in header included by your file | Only fix if the header is also in your changeset; otherwise note it in your report |
Lines changed: 102 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,102 @@
1+
---
2+
name: fe-code-style
3+
description: Fix FE (Java) code style issues using Checkstyle
4+
compatibility: opencode
5+
---
6+
7+
## What I do
8+
9+
Diagnose and fix Java code style issues in the FE module using the project's Checkstyle configuration.
10+
11+
## When to use me
12+
13+
- Before committing FE Java code changes
14+
- When FE build fails due to checkstyle violations
15+
- When you need to understand FE style rules
16+
17+
## Procedure
18+
19+
### Step 1: Run checkstyle
20+
21+
Checkstyle is integrated into the Maven build and runs automatically during `mvn validate`. To check style only (without full compilation):
22+
23+
```bash
24+
cd fe && mvn checkstyle:check -pl fe-core
25+
```
26+
27+
Or as part of the normal build:
28+
29+
```bash
30+
./build.sh --fe -j${DORIS_PARALLELISM}
31+
```
32+
33+
If checkstyle fails, the build will fail with error messages showing the file, line number, and rule violated.
34+
35+
### Step 2: Interpret violations
36+
37+
Checkstyle output looks like:
38+
39+
```
40+
[ERROR] src/main/java/.../Foo.java:[42:5] (imports) UnusedImports: Unused import - java.util.List.
41+
[ERROR] src/main/java/.../Bar.java:[10] (header) RegexpHeader: Line does not match expected header line...
42+
```
43+
44+
Each error shows: `[file]:[line:col] (category) RuleName: description`.
45+
46+
### Step 3: Fix common violations
47+
48+
| Violation | Fix |
49+
|-----------|-----|
50+
| `RegexpHeader` | Add/fix Apache License header at file top |
51+
| `UnusedImports` | Remove unused import statements |
52+
| `LineLength` (>120 chars) | Break long lines |
53+
| `IllegalImport` (shaded classes) | Use the non-shaded equivalent (see `import-control.xml`) |
54+
| `FileTabCharacter` | Replace tabs with spaces |
55+
| `NewlineAtEndOfFile` | Ensure file ends with a newline |
56+
| `MergeConflictMarker` | Resolve git merge conflicts |
57+
58+
### Step 4: Verify fix
59+
60+
After fixing, re-run checkstyle to confirm:
61+
62+
```bash
63+
cd fe && mvn checkstyle:check -pl fe-core
64+
```
65+
66+
## Key Configuration
67+
68+
| File | Purpose |
69+
|------|---------|
70+
| `fe/check/checkstyle/checkstyle.xml` | Main rules (license header, line length 120, encoding, imports) |
71+
| `fe/check/checkstyle/suppressions.xml` | Rule exclusions (test files, nereids, large files) |
72+
| `fe/check/checkstyle/import-control.xml` | Import restrictions (no shaded classes, no old logging APIs, no Lombok in nereids) |
73+
| `fe/check/checkstyle/checkstyle-apache-header.txt` | Apache License 2.0 header template |
74+
| `build-support/IntelliJ-code-format.xml` | IntelliJ formatter scheme (120 col, import organization) |
75+
76+
## Suppression Rules
77+
78+
Some files/packages have relaxed rules (see `suppressions.xml`):
79+
- Test files: Javadoc rules suppressed
80+
- Non-nereids code: Some strict rules suppressed
81+
- Specific large files: Individual suppressions
82+
83+
## Excluded Code
84+
85+
The following are excluded from checkstyle (see `fe/pom.xml`):
86+
- `**/apache/doris/thrift/**/*` (generated Thrift code)
87+
- `**/apache/parquet/**/*` (generated Parquet code)
88+
89+
## Import Restrictions (key rules from import-control.xml)
90+
91+
- **No shaded imports**: Do not use `org.apache.doris.thirdparty.*` directly
92+
- **No old logging**: Do not use `org.apache.commons.logging` or `java.util.logging`
93+
- **No SimpleDateFormat**: Use `java.time` APIs instead
94+
- **No Lombok in nereids**: `lombok` is disallowed in `org.apache.doris.nereids` package
95+
96+
## Troubleshooting
97+
98+
| Problem | Solution |
99+
|---------|----------|
100+
| Build fails on `validate` phase | Run `mvn checkstyle:check -pl fe-core` to see specific violations |
101+
| Can't find checkstyle config | Ensure you're running from the `fe/` directory |
102+
| IntelliJ formatting differs | Import `build-support/IntelliJ-code-format.xml` via Settings → Editor → Code Style |

AGENTS.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,16 @@ When adding code, strictly follow existing similar code in similar contexts, inc
1616

1717
After adding code, you must first conduct self-review and refactoring attempts to ensure good abstraction and reuse as much as possible.
1818

19+
### Code Style Enforcement
20+
21+
All code must pass style checks before committing. Use the corresponding skill for detailed step-by-step procedures.
22+
23+
**BE (C++) Formatting**: Run `build-support/clang-format.sh` to auto-fix formatting. This script enforces clang-format v16; do not use other versions. Run `build-support/check-format.sh` to check without modifying files. See the `be-code-style` skill for details.
24+
25+
**BE (C++) Static Analysis**: After building BE (which generates `compile_commands.json`), run `build-support/run-clang-tidy.sh` to check modified C++ files against the `.clang-tidy` config. The script parses `git diff` to filter warnings to changed lines where possible, reducing noise from pre-existing code (diagnostics from included headers may still appear). For Cloud C++ files, pass `--build-dir` pointing to the Cloud compilation database (e.g., `cloud/build_ASAN`). Try to fix all reported warnings; if a warning cannot be reasonably fixed, add a `// NOLINT` comment with justification and report it. See the `clang-tidy-check` skill for details.
26+
27+
**FE (Java) Style**: Checkstyle is integrated into the Maven build (`maven-checkstyle-plugin`). Running `build.sh --fe` automatically validates style via `mvn validate`. If checkstyle fails, fix the reported issues according to `fe/check/checkstyle/checkstyle.xml`. See the `fe-code-style` skill for details.
28+
1929
## Code Review
2030

2131
When conducting code review (including self-review and review tasks), it is necessary to complete the key checkpoints according to our `code-review` skill and provide conclusions for each key checkpoint (if applicable) as part of the final written description. Other content does not require individual responses; just check them during the review process.

be/src/agent/cgroup_cpu_ctl.cpp

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -28,8 +28,6 @@
2828

2929
namespace doris {
3030

31-
#include "common/compile_check_begin.h"
32-
3331
bool CgroupCpuCtl::is_a_valid_cgroup_path(std::string cg_path) {
3432
if (!cg_path.empty()) {
3533
if (cg_path.back() != '/') {
@@ -445,6 +443,4 @@ Status CgroupV2CpuCtl::add_thread_to_cgroup() {
445443
return CgroupCpuCtl::add_thread_to_cgroup(_cgroup_v2_query_wg_thread_file);
446444
}
447445

448-
#include "common/compile_check_end.h"
449-
450446
} // namespace doris

be/src/agent/task_worker_pool.cpp

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -104,7 +104,6 @@
104104
#include "util/trace.h"
105105

106106
namespace doris {
107-
#include "common/compile_check_begin.h"
108107
using namespace ErrorCode;
109108

110109
namespace {
@@ -1775,7 +1774,9 @@ void create_tablet_callback(StorageEngine& engine, const TAgentTaskRequest& req)
17751774
Defer defer = [&] {
17761775
auto elapsed_time = static_cast<double>(watch.elapsed_time());
17771776
if (elapsed_time / 1e9 > config::agent_task_trace_threshold_sec) {
1777+
#include "common/compile_check_avoid_begin.h"
17781778
COUNTER_UPDATE(profile->total_time_counter(), elapsed_time);
1779+
#include "common/compile_check_avoid_end.h"
17791780
std::stringstream ss;
17801781
profile->pretty_print(&ss);
17811782
LOG(WARNING) << "create tablet cost(s) " << elapsed_time / 1e9 << std::endl << ss.str();
@@ -2527,5 +2528,4 @@ void report_index_policy_callback(const ClusterInfo* cluster_info) {
25272528
}
25282529
}
25292530

2530-
#include "common/compile_check_end.h"
25312531
} // namespace doris

be/src/agent/workload_group_listener.cpp

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -27,8 +27,6 @@
2727

2828
namespace doris {
2929

30-
#include "common/compile_check_begin.h"
31-
3230
void WorkloadGroupListener::handle_topic_info(const std::vector<TopicInfo>& topic_info_list) {
3331
std::set<uint64_t> current_wg_ids;
3432
bool is_set_workload_group_info = false;
@@ -87,5 +85,4 @@ void WorkloadGroupListener::handle_topic_info(const std::vector<TopicInfo>& topi
8785
_exec_env->workload_group_mgr()->delete_workload_group_by_ids(current_wg_ids);
8886
}
8987

90-
#include "common/compile_check_end.h"
9188
} // namespace doris

be/src/cloud/cloud_committed_rs_mgr.cpp

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,6 @@
2525
#include "util/thread.h"
2626

2727
namespace doris {
28-
#include "common/compile_check_begin.h"
2928
CloudCommittedRSMgr::CloudCommittedRSMgr() : _stop_latch(1) {}
3029

3130
CloudCommittedRSMgr::~CloudCommittedRSMgr() {
@@ -138,5 +137,4 @@ void CloudCommittedRSMgr::_clean_thread_callback() {
138137
} while (!_stop_latch.wait_for(
139138
std::chrono::seconds(config::remove_expired_tablet_txn_info_interval_seconds)));
140139
}
141-
#include "common/compile_check_end.h"
142140
} // namespace doris

0 commit comments

Comments
 (0)