Motivation
Apache Pinot currently has mechanisms to kill runaway queries based on CPU time and memory pressure via QueryResourceAggregator. However, these mechanisms have limitations:
- Reactive, not proactive: The background thread (QueryResourceAggregator) samples resource usage periodically (default 100ms). A query scanning billions of entries can cause significant damage before being detected.
- Delayed detection: A full table scan query can process 100M+ entries between background thread samples, causing CPU spikes and impacting other queries before termination.
- Gap in protection: The existing killMostExpensiveQuery() (memory-based) and killCPUTimeExceedQueries() (CPU-based) work well for memory-intensive and CPU-intensive queries respectively. However, a query doing a massive filter scan may not immediately trigger high memory usage but will consume significant CPU cycles processing entries that ultimately get filtered out.
Problem: In production environments with tables containing billions of rows, a single missing filter predicate or low-selectivity filter can trigger full segment scans. By the time the background thread detects high CPU usage, the query has already consumed significant resources, degraded cluster performance, and increased tail latency for most other queries.
To solve this problem, I would like to propose a Proactive Query Killing Framework that terminates queries inline based on scan thresholds, complementing the existing background-based CPU/memory killing. This framework plugs into the existing query killing mechanism and introduces a proactive way to track and terminate queries instead of relying on overall JVM memory or CPU usage. The strategy will be made pluggable so it can be extended with more cost measurements in the future.
Core Components
QueryCostContext Interface: Exposes real-time query metrics:
- getNumEntriesScannedInFilter()
- getNumDocsScanned()
- getElapsedTimeMs()
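A minimal Java sketch of what this interface could look like, based on the three metrics listed above. The SimpleQueryCostContext holder class is a hypothetical addition for illustration, not existing Pinot code:

```java
// Sketch of the proposed QueryCostContext interface, exposing the
// real-time scan metrics listed above.
interface QueryCostContext {
  long getNumEntriesScannedInFilter();
  long getNumDocsScanned();
  long getElapsedTimeMs();
}

// Hypothetical immutable snapshot of a query's current cost, useful for
// passing a consistent view of the metrics to a killing strategy.
class SimpleQueryCostContext implements QueryCostContext {
  private final long _entriesScannedInFilter;
  private final long _docsScanned;
  private final long _elapsedTimeMs;

  SimpleQueryCostContext(long entriesScannedInFilter, long docsScanned, long elapsedTimeMs) {
    _entriesScannedInFilter = entriesScannedInFilter;
    _docsScanned = docsScanned;
    _elapsedTimeMs = elapsedTimeMs;
  }

  @Override public long getNumEntriesScannedInFilter() { return _entriesScannedInFilter; }
  @Override public long getNumDocsScanned() { return _docsScanned; }
  @Override public long getElapsedTimeMs() { return _elapsedTimeMs; }
}
```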
QueryKillingStrategy Interface: Pluggable strategy for termination decisions:
```java
interface QueryKillingStrategy {
  boolean shouldTerminate(QueryCostContext context);
  String getTerminationReason();
}
```
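As an illustration of the pluggable design, a threshold-based implementation might look like the following. The interface signatures come from the proposal above; the MaxEntriesScannedStrategy class name is an assumption for this sketch, not existing code:

```java
// Interfaces as proposed above.
interface QueryCostContext {
  long getNumEntriesScannedInFilter();
  long getNumDocsScanned();
  long getElapsedTimeMs();
}

interface QueryKillingStrategy {
  boolean shouldTerminate(QueryCostContext context);
  String getTerminationReason();
}

// Hypothetical strategy: terminate a query once its filter phase has
// scanned more entries than the configured threshold. The scan operator
// would consult this strategy inline, rather than waiting for the
// background resource-usage thread.
class MaxEntriesScannedStrategy implements QueryKillingStrategy {
  private final long _maxEntriesScannedInFilter;

  MaxEntriesScannedStrategy(long maxEntriesScannedInFilter) {
    _maxEntriesScannedInFilter = maxEntriesScannedInFilter;
  }

  @Override
  public boolean shouldTerminate(QueryCostContext context) {
    return context.getNumEntriesScannedInFilter() > _maxEntriesScannedInFilter;
  }

  @Override
  public String getTerminationReason() {
    return "Exceeded entriesScannedInFilter threshold of " + _maxEntriesScannedInFilter;
  }
}
```

Because termination decisions go through this single interface, additional cost measurements (e.g. docs scanned, elapsed time) can be added as further strategy implementations without touching the scan code.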
Configuration
```properties
pinot.server.query.killer.enabled=true
pinot.server.query.killer.threshold.entriesScannedInFilter=100000000
pinot.server.query.killer.threshold.docsScanned=10000000
# Can also be overridden with table-specific overrides
```
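For illustration, these server-level keys could be parsed into a simple threshold holder. The property keys match the configuration above; the ScanThresholds class and the use of plain java.util.Properties are assumptions for this sketch, not Pinot's actual configuration API:

```java
import java.util.Properties;

// Hypothetical holder for the proposed killer configuration. Missing
// threshold keys default to Long.MAX_VALUE, i.e. effectively disabled.
class ScanThresholds {
  final boolean enabled;
  final long maxEntriesScannedInFilter;
  final long maxDocsScanned;

  ScanThresholds(boolean enabled, long maxEntriesScannedInFilter, long maxDocsScanned) {
    this.enabled = enabled;
    this.maxEntriesScannedInFilter = maxEntriesScannedInFilter;
    this.maxDocsScanned = maxDocsScanned;
  }

  // Parse the server-level keys listed in the Configuration section.
  static ScanThresholds fromProperties(Properties props) {
    boolean enabled = Boolean.parseBoolean(
        props.getProperty("pinot.server.query.killer.enabled", "false"));
    long maxEntries = Long.parseLong(props.getProperty(
        "pinot.server.query.killer.threshold.entriesScannedInFilter",
        String.valueOf(Long.MAX_VALUE)));
    long maxDocs = Long.parseLong(props.getProperty(
        "pinot.server.query.killer.threshold.docsScanned",
        String.valueOf(Long.MAX_VALUE)));
    return new ScanThresholds(enabled, maxEntries, maxDocs);
  }
}
```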
#### Non-Goals for now (can be a future prospect)
- Broker-side query killing (future work)
- Query fingerprinting / pattern-based blocking
- Adaptive thresholds based on cluster load
#### Similar design in other systems
- ClickHouse provides comprehensive query complexity settings that limit scan operations inline, e.g. max_rows_to_read, max_rows_to_read_leaf, read_overflow_mode.
- Trino provides query.max-scan-physical-bytes to limit data scanned: "The maximum number of bytes that can be scanned by a query during its execution. When this limit is reached, query processing is terminated to prevent excessive resource usage."