Serialization verification in ParentReferenceChecker introduced a performance regression in external tests #5274

@rolnico

Current Behavior

When upgrading the RDF4J dependency of our library powsybl-core from v4.3.2 to newer versions (v4.3.3 and later), we observed a significant performance regression: the duration of our tests increased considerably. For example, on one of our modules that uses RDF4J:

| Environment | Duration on version 4.3.2 | Duration on version 5.1.2 | Difference |
| --- | --- | --- | --- |
| Local computer | 56.409 s | 01:45 min | +87% |
| GitHub | 01:31 min | 02:37 min | +72% |

(see powsybl/powsybl-core#3365 for more information about this issue).

After some investigation, I found that the difference in duration comes from the changes introduced in #4626, in particular the line verifySerializable(tupleExpr); added to the class ParentReferenceChecker.
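To give a feel for why a per-query serialization check can add up, here is a minimal JDK-only sketch. It assumes (this is not confirmed against the RDF4J source) that verifySerializable does something like a Java serialization round trip of the query tree each time the optimizer runs. Node, buildTree, countNodes and roundTrip are hypothetical stand-ins, not RDF4J types, and the timings it prints depend entirely on the machine.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SerializationCheckCost {

    // Hypothetical stand-in for a query-model node; NOT an RDF4J class.
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }
    }

    // Builds a three-level tree: one root, `width` children, `width` leaves per child.
    static Node buildTree(int width) {
        Node root = new Node("root");
        for (int i = 0; i < width; i++) {
            Node child = new Node("child-" + i);
            for (int j = 0; j < width; j++) {
                child.children.add(new Node("leaf-" + i + "-" + j));
            }
            root.children.add(child);
        }
        return root;
    }

    // Counts the nodes in the tree rooted at `node`.
    static int countNodes(Node node) {
        int total = 1;
        for (Node child : node.children) {
            total += countNodes(child);
        }
        return total;
    }

    // Assumed shape of the check: serialize the tree to bytes, then read it back.
    static Node roundTrip(Node root) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(root);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Node) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Node tree = buildTree(10);
        int iterations = 2_000; // stands in for "one check per evaluated query"

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            roundTrip(tree);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println(iterations + " round trips of a " + countNodes(tree)
                + "-node tree took " + elapsedMs + " ms");
    }
}
```

A test suite that evaluates many small queries would pay this cost once per query, which would be consistent with the slowdown we measured, although we have not profiled RDF4J itself at that level.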

The class ParentReferenceChecker is annotated with @InternalUseOnly, so it should only be used during RDF4J's own tests (as its Javadoc states) and not during our unit tests. However, this class is used in StandardQueryOptimizerPipeline, which is the default pipeline used in DefaultEvaluationStrategy and DefaultEvaluationStrategyFactory. Therefore, if you use those default classes, you have no choice but to run the ParentReferenceChecker and incur the extra duration when upgrading. The workaround is to add three classes extending the default/standard classes just to change the list of optimizers used in the EvaluationStrategy (see powsybl/powsybl-core#3361 for an example).

Expected Behavior

We see two possible ways forward:

  • If you believe that the class ParentReferenceChecker has to be used even during external unit tests for some good reason, then maybe the call to verifySerializable(tupleExpr) should be optional
  • If there is no reason to run the ParentReferenceChecker optimizer in our tests, then the way it is added by default in a testing environment should be changed in StandardQueryOptimizerPipeline. For instance, in your own tests, a new class could extend StandardQueryOptimizerPipeline and add the optimizer, so that external tests can use StandardQueryOptimizerPipeline without the ParentReferenceChecker (at least for those who don't need it).
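The second option could look roughly like the sketch below. It only illustrates the pattern and uses no RDF4J code: Optimizer, StandardPipeline, CheckedPipeline and the optimizer names are hypothetical stand-ins, and the real getOptimizers signature in RDF4J may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class OptionalCheckerPipeline {

    // Hypothetical stand-in for an optimizer; NOT the RDF4J interface.
    interface Optimizer {
        String name();
    }

    // Convenience factory: an optimizer that is identified only by its name.
    static Optimizer named(String name) {
        return () -> name;
    }

    // Base pipeline: only the optimizers every user needs, no internal checker.
    static class StandardPipeline {
        List<Optimizer> getOptimizers() {
            return List.of(named("optimizerA"), named("optimizerB"));
        }
    }

    // Pipeline used only by the library's own tests: adds the checker on top.
    static class CheckedPipeline extends StandardPipeline {
        @Override
        List<Optimizer> getOptimizers() {
            List<Optimizer> all = new ArrayList<>(super.getOptimizers());
            all.add(named("parentReferenceChecker"));
            return all;
        }
    }

    // Returns the optimizer names of a pipeline, in execution order.
    static List<String> names(StandardPipeline pipeline) {
        List<String> result = new ArrayList<>();
        for (Optimizer optimizer : pipeline.getOptimizers()) {
            result.add(optimizer.name());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [optimizerA, optimizerB]
        System.out.println(names(new StandardPipeline()));
        // prints [optimizerA, optimizerB, parentReferenceChecker]
        System.out.println(names(new CheckedPipeline()));
    }
}
```

With this split, external users of the standard pipeline never run the checker, while the library's own test setup opts in explicitly by choosing the extended pipeline.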

Or maybe we are using RDF4J the wrong way and we should all be using ParentReferenceChecker? 😄

Steps To Reproduce

  • Any environment (our tests run on Ubuntu/Windows/MacOS)
  • Java 17 (but it should not matter)
  • Powsybl-core v6.6.1 (or main branch)
  • RDF4J 4.3.3 or newer

And then run the unit tests for the module "CGMES conversion".

I'm not familiar enough with RDF4J to provide a test specific to only RDF4J.

Version

4.3.3 and newer (also tested on 4.1.15 and 5.1.2)

Are you interested in contributing a solution yourself?

Perhaps?

Labels

wontfix: issue won't be fixed (close reason)
