task(Migrate ESMappingAPIImpl) Refs #34933#35289
task(Migrate ESMappingAPIImpl) Refs #34933#35289fabrizzio-dotCMS wants to merge 13 commits intomainfrom
Conversation
|
Claude finished @fabrizzio-dotCMS's task in 4m 26s —— View job PR Review: task(Migrate ESMappingAPIImpl) Refs #34933
The architecture (PhaseRouter, CompositeBulkProcessor, DualIndexBulkRequest) is sound. Below are concrete issues found, ordered by severity. Bugs1. Thread-safety violation —
final Date startTime = IndiciesInfo.timestampFormatter.parse(ts);
2. Wrong cluster prefix applied in
long getIndexDocumentCount(final String indexName, final IndexTag tag) {
final String name = indexAPI.getNameWithClusterIDPrefix(indexName); // ← phase-aware: ES in Phase 1
ContentletIndexOperations operations = router.esImpl();
if (tag == IndexTag.OS) {
operations = router.osImpl();
}
return operations.getIndexDocumentCount(name); // ← OS client gets an ES-prefixed name
}
3. Misleading log message in
} catch (final Exception e) {
Logger.warnAndDebug(this.getClass(),
"OS shadow write failed in putToIndex — ES write succeeded; ...");
}The "ES write succeeded" text is hardcoded. It appears even when ES has thrown and Design / Risk4.
for (final ContentletIndexOperations ops : router.writeProviders()) {
try {
result &= ops.createContentIndex(physicalName, shards);
} catch (Exception e) {
Logger.error(...);
result = false;
}
}Both ES (primary) and OS (shadow) exceptions are treated identically: logged, then silently absorbed. The public method 5. Downcast to
final IndexAPIImpl impl = (IndexAPIImpl) indexAPI;The 6. Stale comment contradicts implementation
This is directly contradicted by 7. OSGi API breakage (flagged twice previously — still unaddressed) The two prior automated reviews (April 10 and April 11) both flagged Minor8. 9. Missing 10. PR description error: The Breaking Changes section says callers should use Items 1–3 should be fixed before merge. Items 4–7 are worth a follow-up or discussion. |
|
Pull Request Unsafe to Rollback!!!
|
…/core into issue-34933-Mapping-Layer
|
Pull Request Unsafe to Rollback!!!
|
…/core into issue-34933-Mapping-Layer
…/core into issue-34933-Mapping-Layer
…Suite Verifies the full createContentIndex pipeline on OS: - index creation and existence - dynamic templates stored in mapping (template_1, textmapping, geomapping, keywordmapping) - dynamic templates fire on real documents (*_dotraw→keyword, *_text→text, *latlon→geo_point) - auto_expand_replicas=0-1 index setting - custom analysers (my_analyzer, dot_comma_analyzer) from os-content-settings.json Also adds OS endpoint config to dotcms-config-cluster.properties for the test runner. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Migrates
ContentletIndexAPIImpl— the central content indexing implementation — to the vendor-neutral Phase-aware router pattern. All Elasticsearch-specific types (BulkRequest,BulkProcessor,ActionListener) are replaced with domain-layer abstractions (IndexBulkRequest,IndexBulkProcessor,IndexBulkListener), enabling dual-write routing to both ES and OS backends during the ES→OS migration without leaking vendor types to callers.Changes
Backend — Core Indexing
ContentletIndexAPI(interface): Removed allorg.elasticsearch.*imports from method signatures; replaced with vendor-neutralIndexBulkRequest,IndexBulkProcessor, andIndexBulkListenerdomain types. Added@IndexLibraryIndependentannotation. ChangedfullReindexStart()return type fromStringtoIndexStartResult. Added thread-safeDateTimeFormatteralongside deprecatedSimpleDateFormat.ContentletIndexAPIImpl: Full rewrite to a phase-aware router:ContentletIndexOperationsinstances (operationsES,operationsOS) and delegates viaPhaseRouter<ContentletIndexOperations>DualIndexBulkRequestinner class fans out synchronous bulk batches to both ES and OS in dual-write phasesCompositeBulkProcessorinner class fans out async bulk processing; OS failures are fire-and-forget (shadow) in phases 1/2, propagating only in phase 3ProviderIndicesinner class resolves working/live/reindex index names per provider, strippingos::vendor tags before passing to the OS clientContentMappingAPIviaAtomicReferenceto break the circular dependency chainorg.elasticsearch.*imports from the implementationIndexTag: New utility for stripping vendor-specific index name prefixes (os::)IndexStartResult: New domain value object replacing the rawStringreturned byfullReindexStart()BulkProcessorListener: Updated to implementIndexBulkListenerinstead of the ES-specific listenerReindexThread: Minor updates to useIndexBulkProcessor/IndexBulkRequestinstead of ES typesMappingHelper: ReplacesESMappingUtilHelperas the vendor-neutral mapping utilityContentletIndexOperationsES/ContentletIndexOperationsOS: Updated to expose the new domain type APIsTesting
ContentletIndexAPIImplTest: Updated to use the new domain types and refactored test constructorsESSiteSearchAPITest,EMAWebInterceptorTest,CleanUpFieldReferencesJobTest: Minor compile fixes following the API changesTesting
just test-integration-ideReindexThreadcompletes without errors in both Phase 0 (ES only) and Phase 1 (dual-write) configurationsBreaking Changes
ContentletIndexAPI.fullReindexStart()now returnsIndexStartResultinstead ofString. Callers previously using the raw timestamp string must call.timestampSuffix()on the result.BulkRequest,BulkProcessor, andActionListener<BulkResponse>removed from theContentletIndexAPIinterface. Any code directly casting or importing these types against the interface will not compile.This PR fixes: #34933