We adopt the Turing machine (TM) model to analyze query complexity. Difficulty is measured by the resources required - usually time and space - relative to the input size
- PTIME1: Problems solvable in polynomial time. This is the standard benchmark for tractability.
- LOGSPACE: Problems solvable using work tape space logarithmic to the input size. The input is kept on a read-only tape since the input itself is larger than the work tape.
-
PSPACE2: Problems solvable in polynomial space.
$PSPACE = NPSPACE$ by Savich's theorem.
A language
| Type of Completeness | Type of Reduction |
|---|---|
| P-completeness | LOGSPACE reductions |
| NP-completeness | PTIME reductions |
| PSPACE-completeness | PTIME reductions |
The two main possible ways to look at the complexity of queries are as follows:
- Data complexity: the complexity of evaluating a fixed query for variable database inputs
- Expression complexity: the complexity of evaluating, on a fixed database instance, the various queries specifiable in a given query language
The usual situation is that the size of the database inputs dominates by far the size of the query, so data complexity is typically most relevant.
Complexity is formally defined via the recognition problem: Given an instance
First-order logic queries3 are highly expressive.
- LOGSPACE inclusion: CALC queries are in QLOGSPACE. A TM can evaluate a query by using pointers4 to scan the input tape for each variable in the formula.
- Parallel complexity5: CALC queries can be evaluated in constant time given a polynomial number of processors. This is because projection is mapped to constant-depth circuits.
Transitive closure and other recursive properties can't be expressed in CALC. This necessitates Datalog, which uses fixpoint evaluation. We use the semi-naïve algorithm to compute the result of a recursive Datalog program. The goal is to avoid redundant computations by only using newly derived tuples6 in each subsequent iteration. For a rule
Form the precedence graph
The fact that each fixpoint query10 is in PTIME follows immediately from the inflationary11 nature of languages defining the fixpoint queries. The total number of tuples that can be built from constants in a given instance is polynomial in the size of the instance. The while queries with cumbersome updates are complete in PSPACE. A fundamental result is that fixpoint is complete in PTIME and that while is complete in PSPACE. These languages can't express simple queries12 if no order is provided.
We adopt a common notational convention for two parameters. Inside asymptotic notation, the symbol