Skip to content

task(Migrate ESMappingAPIImpl) Refs #34933#35289

Open
fabrizzio-dotCMS wants to merge 13 commits intomainfrom
issue-34933-Mapping-Layer
Open

task(Migrate ESMappingAPIImpl) Refs #34933#35289
fabrizzio-dotCMS wants to merge 13 commits intomainfrom
issue-34933-Mapping-Layer

Conversation

@fabrizzio-dotCMS
Copy link
Copy Markdown
Member

@fabrizzio-dotCMS fabrizzio-dotCMS commented Apr 10, 2026

Summary

Migrates ContentletIndexAPIImpl — the central content indexing implementation — to the vendor-neutral Phase-aware router pattern. All Elasticsearch-specific types (BulkRequest, BulkProcessor, ActionListener) are replaced with domain-layer abstractions (IndexBulkRequest, IndexBulkProcessor, IndexBulkListener), enabling dual-write routing to both ES and OS backends during the ES→OS migration without leaking vendor types to callers.

Changes

Backend — Core Indexing

  • ContentletIndexAPI (interface): Removed all org.elasticsearch.* imports from method signatures; replaced with vendor-neutral IndexBulkRequest, IndexBulkProcessor, and IndexBulkListener domain types. Added @IndexLibraryIndependent annotation. Changed fullReindexStart() return type from String to IndexStartResult. Added thread-safe DateTimeFormatter alongside deprecated SimpleDateFormat.
  • ContentletIndexAPIImpl: Full rewrite to a phase-aware router:
    • Holds two ContentletIndexOperations instances (operationsES, operationsOS) and delegates via PhaseRouter<ContentletIndexOperations>
    • New DualIndexBulkRequest inner class fans out synchronous bulk batches to both ES and OS in dual-write phases
    • New CompositeBulkProcessor inner class fans out async bulk processing; OS failures are fire-and-forget (shadow) in phases 1/2, propagating only in phase 3
    • ProviderIndices inner class resolves working/live/reindex index names per provider, stripping os:: vendor tags before passing to the OS client
    • Lazy-initialized ContentMappingAPI via AtomicReference to break the circular dependency chain
    • Dropped all direct org.elasticsearch.* imports from the implementation
  • IndexTag: New utility for stripping vendor-specific index name prefixes (os::)
  • IndexStartResult: New domain value object replacing the raw String returned by fullReindexStart()
  • BulkProcessorListener: Updated to implement IndexBulkListener instead of the ES-specific listener
  • ReindexThread: Minor updates to use IndexBulkProcessor/IndexBulkRequest instead of ES types
  • MappingHelper: Replaces ESMappingUtilHelper as the vendor-neutral mapping utility
  • ContentletIndexOperationsES / ContentletIndexOperationsOS: Updated to expose the new domain type APIs

Testing

  • ContentletIndexAPIImplTest: Updated to use the new domain types and refactored test constructors
  • ESSiteSearchAPITest, EMAWebInterceptorTest, CleanUpFieldReferencesJobTest: Minor compile fixes following the API changes

Testing

  1. Start the integration test environment: just test-integration-ide
  2. Run the primary integration test:
    ./mvnw verify -pl :dotcms-integration -Dcoreit.test.skip=false -Dit.test=ContentletIndexAPIImplTest
    
  3. Verify reindexing still works end-to-end by triggering a full reindex from the dotCMS Admin → System → Reindex
  4. Check that ReindexThread completes without errors in both Phase 0 (ES only) and Phase 1 (dual-write) configurations

Breaking Changes

  • ContentletIndexAPI.fullReindexStart() now returns IndexStartResult instead of String. Callers previously using the raw timestamp string must call .timestampSuffix() on the result.
  • BulkRequest, BulkProcessor, and ActionListener<BulkResponse> removed from the ContentletIndexAPI interface. Any code directly casting or importing these types against the interface will not compile.

This PR fixes: #34933

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 10, 2026

Claude finished @fabrizzio-dotCMS's task in 4m 26s —— View job


PR Review: task(Migrate ESMappingAPIImpl) Refs #34933

  • Gather context and read changed files
  • Analyze code
  • Post findings

The architecture (PhaseRouter, CompositeBulkProcessor, DualIndexBulkRequest) is sound. Below are concrete issues found, ordered by severity.


Bugs

1. Thread-safety violation — SimpleDateFormat shared across threads

ContentletIndexAPIImpl.java:1245:

final Date startTime = IndiciesInfo.timestampFormatter.parse(ts);

IndiciesInfo.timestampFormatter is a static final SimpleDateFormat — not thread-safe. elapsedSinceIndexCreated() is called from reindexTimeElapsedInLong()fullReindexSwitchover(), which has no synchronization barrier. Under concurrent reindex loads this will corrupt the parse result or throw NumberFormatException. The fix is already present: use ContentletIndexAPI.threadSafeTimestampFormatter with LocalDateTime. Fix this →

2. Wrong cluster prefix applied in getIndexDocumentCount(String, IndexTag)

ContentletIndexAPIImpl.java:2530–2536:

long getIndexDocumentCount(final String indexName, final IndexTag tag) {
    final String name = indexAPI.getNameWithClusterIDPrefix(indexName); // ← phase-aware: ES in Phase 1
    ContentletIndexOperations operations = router.esImpl();
    if (tag == IndexTag.OS) {
        operations = router.osImpl();
    }
    return operations.getIndexDocumentCount(name); // ← OS client gets an ES-prefixed name
}

isOsWorkingIndexEmpty() calls this with a raw OS index name from versionedIndicesAPI and IndexTag.OS. In Phase 1, indexAPI.getNameWithClusterIDPrefix() is delegated to the ES read provider and applies an ES cluster prefix to an OS index name. The OS client then queries a non-existent index. isOsWorkingIndexEmpty() is called from hasEmptyIndices()reindexIfNoIndicesFound() — a false positive here triggers a spurious full reindex on startup. Should call router.osImpl().toPhysicalName(indexName) instead of indexAPI.getNameWithClusterIDPrefix() when tag == IndexTag.OS.

3. Misleading log message in putToIndex()

ContentletIndexAPIImpl.java:1452–1458:

} catch (final Exception e) {
    Logger.warnAndDebug(this.getClass(),
            "OS shadow write failed in putToIndex — ES write succeeded; ...");
}

The "ES write succeeded" text is hardcoded. It appears even when ES has thrown and esException is non-null — so operators reading logs after a dual failure will be misled. Guard the message or make it conditional on esException == null. Fix this →


Design / Risk

4. createContentIndex(String, int) silently swallows primary failure

ContentletIndexAPIImpl.java:554–567:

for (final ContentletIndexOperations ops : router.writeProviders()) {
    try {
        result &= ops.createContentIndex(physicalName, shards);
    } catch (Exception e) {
        Logger.error(...);
        result = false;
    }
}

Both ES (primary) and OS (shadow) exceptions are treated identically: logged, then silently absorbed. The public method createContentIndex(String, int) returns false to the caller who has no way to distinguish "ES hard failure" from "OS shadow drift". Consistent with the PhaseRouter contract, ES failures should be rethrown; OS failures in dual-write should be swallowed.

5. Downcast to IndexAPIImpl in readiness checks

ContentletIndexAPIImpl.java:422 and 461:

final IndexAPIImpl impl = (IndexAPIImpl) indexAPI;

The indexAPI field is declared as IndexAPI. The cast works only because APILocator.getESIndexAPI() returns new IndexAPIImpl(). If indexAPI is ever injected via the test constructor (ContentletIndexAPIImpl(ContentletIndexOperations, ContentletIndexOperations)) with a mock, this throws ClassCastException. indexReadyES() is already @VisibleForTesting which increases the probability that test authors will hit this. Consider exposing esImpl() / osImpl() directly on IndexAPI or accepting the concrete type in the constructor.

6. Stale comment contradicts implementation

ContentletIndexAPIImpl.java:1777–1778:

Dual-write via the processor path is not yet supported;
full dual-write is handled by the synchronous bulk-request path instead.

This is directly contradicted by createBulkProcessor() (line 1513), which creates a CompositeBulkProcessor with one entry per write provider, and addBulkRequestToProcessor() (line 1791), which calls resolveProcessorTargets() and fans out to all entries. Dual-write via the processor path is implemented; the comment is misleading for anyone trying to understand the code.

7. OSGi API breakage (flagged twice previously — still unaddressed)

The two prior automated reviews (April 10 and April 11) both flagged ContentletIndexAPI interface changes and BulkProcessorListener contract changes as M-4 risk. This PR still removes checkAndInitialiazeIndex(), putToIndex(BulkRequest, ActionListener<BulkResponse>), and related overloads without providing @Deprecated bridge methods. Any OSGi plugin compiled against the N-1 interface will fail at activation with NoSuchMethodError.


Minor

8. DualIndexBulkRequest.size() redundant condition (line 311): isMigrationComplete() || isReadEnabled() simplifies to isReadEnabled() since Phase 3 always implies OS reads. Cosmetic.

9. Missing @Override on interface methods in impl: fullReindexAbort, isDotCMSIndexName, listDotCMSIndices, listDotCMSClosedIndices, activateIndex, deactivateIndex, getActiveIndexName, getIndexDocumentCount(String). The CLAUDE.md style guide calls for progressive enhancement — these should be added.

10. PR description error: The Breaking Changes section says callers should use .timestampSuffix() on IndexStartResult, but that method does not exist. The actual methods are timeStampOS() and timeStampES().


Items 1–3 should be fixed before merge. Items 4–7 are worth a follow-up or discussion.

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 10, 2026

Pull Request Unsafe to Rollback!!!

  • Category: M-4 — OSGi Plugin API Breakage

  • Risk Level: 🟡 MEDIUM

  • Why it's unsafe: The ContentletIndexAPI public interface had multiple breaking method signature changes. Any OSGi plugin compiled against the previous version of this interface will fail at activation time with NoSuchMethodError or IncompatibleClassChangeError.

    Breaking changes introduced in this PR:

    1. Method renamed: checkAndInitialiazeIndex()checkAndInitializeIndex() in ContentletIndexAPI. Any plugin calling the old (misspelled) method receives a NoSuchMethodError.
    2. Return type changed: fullReindexStart() now returns IndexStartResult instead of String. Any plugin assigning the result to a String variable will get a ClassCastException at runtime; any plugin that compiled against the old signature will get NoSuchMethodError.
    3. Method removed: putToIndex(BulkRequest, ActionListener<BulkResponse>) was removed from the ContentletIndexAPI interface entirely and replaced with putToIndex(IndexBulkRequest). Any plugin calling the old two-argument form gets NoSuchMethodError.
    4. Listener contract changed: BulkProcessorListener changed from implementing BulkProcessor.Listener (Elasticsearch type) to implementing IndexBulkListener (domain type). Plugins that registered, subclassed, or cast to the old Elasticsearch-typed listener will fail.
  • Code that makes it unsafe:

    • dotCMS/src/main/java/com/dotcms/content/elasticsearch/business/ContentletIndexAPI.java — method checkAndInitializeIndex() (line ~35, was checkAndInitialiazeIndex); method fullReindexStart() return type change (line ~52); removal of putToIndex(BulkRequest, ActionListener<BulkResponse>)
    • dotCMS/src/main/java/com/dotmarketing/common/reindex/BulkProcessorListener.java — class declaration changed from implements BulkProcessor.Listener to implements IndexBulkListener
  • Alternative (if possible): Follow the two-phase OSGi migration pattern:

    • Release N (this PR): Keep the old method signatures as @Deprecated overloads that delegate to the new signatures, so N-1 plugins continue to activate cleanly.
    • Release N+1: Remove the deprecated shims once N-2 is outside the rollback window.

@github-actions github-actions bot added the Area : Documentation PR changes documentation files label Apr 11, 2026
@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 11, 2026

Pull Request Unsafe to Rollback!!!

  • Category: M-4 — OSGi Plugin API Breakage

  • Risk Level: 🟡 MEDIUM

  • Why it's unsafe: The ContentletIndexAPI public interface and BulkProcessorListener class both reside in OSGi-exported packages (com.dotcms.content.elasticsearch.business and com.dotmarketing.common.reindex, confirmed in osgi-extra.conf lines 92 and 573). Multiple breaking changes were made without backward-compatible deprecated shims:

    1. Method removed (no shim): checkAndInitialiazeIndex() removed from ContentletIndexAPI and replaced by checkAndInitializeIndex() (spelling fix). Any OSGi plugin calling the old spelling receives NoSuchMethodError. The only in-tree caller — InitServlet.java:94 — was updated, but external plugin callers were not.

    2. Return type changed: fullReindexStart() return changed from String to IndexStartResult. Any plugin that assigns the result to a String variable throws ClassCastException; any plugin compiled against the old signature throws NoSuchMethodError.

    3. Methods removed: putToIndex(BulkRequest, ActionListener<BulkResponse>) and the BulkRequest-typed overloads of createBulkRequest(), appendBulkRequest(), appendBulkRemoveRequest() removed from ContentletIndexAPI. Replaced with IndexBulkRequest-typed equivalents. Plugins calling any of these forms get NoSuchMethodError.

    4. Processor API changed: createBulkProcessor(BulkProcessorListener) replaced by createBulkProcessor(IndexBulkListener); appendToBulkProcessor(BulkProcessor, ...) replaced by appendToBulkProcessor(IndexBulkProcessor, ...). Plugins using async bulk processing cannot compile or activate.

    5. Listener contract changed: BulkProcessorListener changed from implements BulkProcessor.Listener (Elasticsearch type) to implements IndexBulkListener (domain type). The beforeBulk and afterBulk method signatures changed (e.g. BulkRequestint actionCount; BulkResponseList<IndexBulkItemResult>). Plugins that subclass, cast, or register BulkProcessorListener as a BulkProcessor.Listener fail at activation.

  • Code that makes it unsafe:

    • dotCMS/src/main/java/com/dotcms/content/elasticsearch/business/ContentletIndexAPI.java — all lines; checkAndInitializeIndex at line 38 (was checkAndInitialiazeIndex); fullReindexStart return at line 52; all BulkRequest/BulkProcessor/ActionListener method signatures replaced throughout
    • dotCMS/src/main/java/com/dotmarketing/common/reindex/BulkProcessorListener.java — class declaration line 34 (implements IndexBulkListener replacing BulkProcessor.Listener); beforeBulk and afterBulk method signatures
    • dotCMS/src/main/resources/osgi/osgi-extra.conf lines 92, 573 — confirm both affected packages are exported to the OSGi container
  • Alternative (if possible): Follow the two-phase OSGi migration pattern:

    • Release N (this PR): Keep each removed/changed method as a @Deprecated overload that delegates to the new signature. For example, add @Deprecated default void checkAndInitialiazeIndex() { checkAndInitializeIndex(); } as an interface default method; add a @Deprecated default String fullReindexStart() { return fullReindexStart_new().timestampSuffix(); } shim (renaming the new method accordingly). N-1 plugins continue to activate cleanly against these shims.
    • Release N+1: Remove the deprecated shims once N-2 is outside the rollback window.

@fabrizzio-dotCMS fabrizzio-dotCMS marked this pull request as ready for review April 13, 2026 18:39
@fabrizzio-dotCMS fabrizzio-dotCMS requested a review from wezell April 13, 2026 20:15
fabrizzio-dotCMS and others added 2 commits April 13, 2026 14:37
…Suite

Verifies the full createContentIndex pipeline on OS:
- index creation and existence
- dynamic templates stored in mapping (template_1, textmapping, geomapping, keywordmapping)
- dynamic templates fire on real documents (*_dotraw→keyword, *_text→text, *latlon→geo_point)
- auto_expand_replicas=0-1 index setting
- custom analysers (my_analyzer, dot_comma_analyzer) from os-content-settings.json

Also adds OS endpoint config to dotcms-config-cluster.properties for the test runner.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[TASK] Migrate ESMappingAPIImpl

1 participant