Skip to content

fix: remove duplicate rules from SARIF output#583

Open
danskmt wants to merge 1 commit intomainfrom
fix/CLI-1344-deduplicate-rules-in-ufm-sarif
Open

fix: remove duplicate rules from SARIF output#583
danskmt wants to merge 1 commit intomainfrom
fix/CLI-1344-deduplicate-rules-in-ufm-sarif

Conversation

@danskmt
Copy link
Copy Markdown
Contributor

@danskmt danskmt commented Apr 9, 2026

Description

Deduplicates SARIF rules by problem ID in UFM output to fix duplicate rule entries. When multiple findings share the same problem ID (e.g., two generic-api-key secrets found in different files), the SARIF output previously emitted one rule per finding, producing duplicate rule IDs in the rules array. Per the SARIF v2.1.0 spec, rule IDs must be unique within the rules array.

The fix adds a deduplicateIssuesByProblemID template function that filters the issues list to one entry per unique problem ID (first-wins) for the rules loop only. The results loop is unchanged — all findings still appear as individual results referencing their shared rule.

Relevant ticket: CLI-1344

Checklist

  • Tests added and all succeed (make test)
  • Regenerated mocks, etc. (make generate)
  • Linted (make lint)
  • Test your changes work for the CLI
    1. Clone / pull the latest CLI main.
    2. Run go get github.com/snyk/go-application-framework@YOUR_LATEST_GAF_COMMIT in the cliv2 directory.
      • Tip: for local testing, you can uncomment the line near the bottom of the CLI's go.mod to point to your local GAF code.
    3. Run go mod tidy in the cliv2 directory.
    4. Run the CLI tests and do any required manual testing.
    5. Open a PR in the CLI repo now with the go.mod and go.sum changes.
    • Once this PR is merged, repeat these steps, but pointing to the latest GAF commit on main and update your CLI PR.

@snyk-io
Copy link
Copy Markdown

snyk-io bot commented Apr 9, 2026

Snyk checks have passed. No issues have been found so far.

Status Scan Engine Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues
Code Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@snyk-io
Copy link
Copy Markdown

snyk-io bot commented Apr 9, 2026

Snyk checks have passed. No issues have been found so far.

Status Scan Engine Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues
Code Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@danskmt danskmt force-pushed the fix/CLI-1344-deduplicate-rules-in-ufm-sarif branch from e880cea to 6c302d1 Compare April 9, 2026 12:50
@danskmt danskmt marked this pull request as ready for review April 10, 2026 11:40
@danskmt danskmt requested review from a team as code owners April 10, 2026 11:40
@danskmt danskmt force-pushed the fix/CLI-1344-deduplicate-rules-in-ufm-sarif branch from 6c302d1 to 8326f16 Compare April 10, 2026 11:44
@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot

This comment has been minimized.

@danskmt danskmt force-pushed the fix/CLI-1344-deduplicate-rules-in-ufm-sarif branch from 8326f16 to 0edcf10 Compare April 10, 2026 13:00
@snyk-pr-review-bot

This comment has been minimized.

@danskmt danskmt force-pushed the fix/CLI-1344-deduplicate-rules-in-ufm-sarif branch from 0edcf10 to 7498905 Compare April 10, 2026 13:23
@danskmt danskmt changed the title fix: deduplicate SARIF rules by problem ID in UFM output fix: deduplicate SARIF rules in UFM output Apr 10, 2026
@danskmt danskmt changed the title fix: deduplicate SARIF rules in UFM output fix: remove duplicate rules from SARIF output Apr 10, 2026
@danskmt danskmt force-pushed the fix/CLI-1344-deduplicate-rules-in-ufm-sarif branch from 7498905 to e28c523 Compare April 10, 2026 13:26
@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot
Copy link
Copy Markdown

PR Reviewer Guide 🔍

🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Lossy Metadata in Rule Generation 🟡 [minor]

The deduplicateIssuesByProblemID function performs a 'first-wins' deduplication based on problem ID. If two findings share a problem ID but have different metadata used by SARIF rule generation functions (like different severities or CVSS scores extracted via getRuleCVSSScore or buildRuleShortDescription), the SARIF rule definition will only reflect the metadata of whichever finding appeared first in the input list. While this satisfies the SARIF requirement for unique rule IDs, it may result in inaccurate rule descriptions if the findings are not perfectly identical in their vulnerability definitions.

func deduplicateIssuesByProblemID(issues []testapi.Issue) []testapi.Issue {
	seen := make(map[string]bool)
	result := make([]testapi.Issue, 0, len(issues))
	for _, issue := range issues {
		id := issue.GetProblemID()
		if id == "" {
			id = issue.GetID()
		}
		if !seen[id] {
			seen[id] = true
			result = append(result, issue)
		}
	}
	return result
}
📚 Repository Context Analyzed

This review considered 10 relevant code sections from 8 files (average relevance: 0.82)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant