Commit a456cd7

feat: allow filtering contributions and collaborations [IN-707] (#3432)
Signed-off-by: Raúl Santos <4837+borfast@users.noreply.github.com>
1 parent 09ccc0c commit a456cd7

9 files changed

Lines changed: 63 additions & 22 deletions

services/libs/tinybird/pipes/activities_filtered.pipe

Lines changed: 4 additions & 8 deletions
@@ -2,18 +2,17 @@ DESCRIPTION >
     - `activities_filtered.pipe` is the core filtering infrastructure pipe for activity data across the entire analytics platform.
     - This pipe serves as the foundation for most activity-related widgets, by providing a consistent, filtered view of contribution activities.
     - It filters activities from `activityRelations_deduplicated_cleaned_ds` datasource based on project segment, time ranges, repositories, platforms, and activity types.
-    - By default, this pipe returns only contribution activities (`isContribution = 1`) unless explicitly overridden with `onlyContributions = 0`.
     - The pipe automatically scopes data to the current project using `segments_filtered` pipe for security and data isolation.
     - Parameters:
     - `project`: Inherited from `segments_filtered`, project slug (e.g., 'k8s', 'tensorflow')
-    - `repos`: Inherited from `segments_filtered`, array of repository URLs for filtering
+    - `repos`: Optional array of repository URLs for filtering (e.g., ['https://github.com/kubernetes/kubernetes']). Inherited from `segments_filtered`.
     - `startDate`: Optional DateTime filter for activities after timestamp (e.g., '2024-01-01 00:00:00')
     - `endDate`: Optional DateTime filter for activities before timestamp (e.g., '2024-12-31 23:59:59')
-    - `repos`: Optional array of repository URLs (e.g., ['https://github.com/kubernetes/kubernetes'])
     - `platform`: Optional string filter for source platform (e.g., 'github', 'discord', 'slack')
     - `activity_type`: Optional string filter for single activity type (e.g., 'authored-commit')
     - `activity_types`: Optional array of activity types (e.g., ['authored-commit', 'co-authored-commit'])
-    - `onlyContributions`: Optional boolean, defaults to 1 (contributions only), set to 0 for all activities
+    - `includeCodeContributions`: Optional boolean to include code contribution activities. Defaults to 1. Set to 0 to exclude. Inherited from activityTypes_filtered.
+    - `includeCollaborations`: Optional boolean to include or exclude collaboration activities. Inherited from activityTypes_filtered.
     - Response: `id` (activityId), `timestamp`, `type`, `platform`, `memberId`, `organizationId`, `segmentId`.
     - This pipe is consumed by many of downstream pipes and widgets across the platform for consistent activity filtering.
     - Performance is optimized through proper sorting keys on `segmentId`, `timestamp`, `type`, `platform`, and `memberId` in the source datasource.
@@ -41,10 +40,7 @@ SQL >
     AND a.platform
     = {{ String(platform, description="Filter activity platform", required=False) }}
     {% end %}
-    {% if (
-    not defined(onlyContributions)
-    or (defined(onlyContributions) and onlyContributions == 1)
-    ) %} AND a.isContribution {% end %}
+    AND (a.type, a.platform) IN (SELECT activityType, platform FROM activityTypes_filtered)
     {% if defined(activity_type) %}
     AND a.type = {{ String(activity_type, description="Filter activity type", required=False) }}
     {% end %}
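
The hunk above swaps the `isContribution` flag for a tuple membership test against `activityTypes_filtered`. Below is a minimal sketch of that pattern, run against an in-memory SQLite database from Python rather than Tinybird/ClickHouse. The table names `activities` and `activity_types_filtered`, the `message` type, the `gitlab` platform, and all sample rows are assumptions made for illustration; they are not taken from the commit.

```python
import sqlite3

# In-memory stand-ins for the activity datasource and the filter pipe's output.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE activities (id TEXT, type TEXT, platform TEXT);
    CREATE TABLE activity_types_filtered (activityType TEXT, platform TEXT);

    -- Hypothetical sample rows.
    INSERT INTO activities VALUES
        ('a1', 'authored-commit', 'github'),
        ('a2', 'message', 'discord'),
        ('a3', 'authored-commit', 'gitlab');
    INSERT INTO activity_types_filtered VALUES
        ('authored-commit', 'github');
    """
)

# An activity is kept only when its (type, platform) pair appears in the
# filtered list; matching on type alone is not enough.
rows = conn.execute(
    """
    SELECT a.id
    FROM activities AS a
    WHERE (a.type, a.platform) IN (
        SELECT activityType, platform FROM activity_types_filtered
    )
    ORDER BY a.id
    """
).fetchall()

kept_ids = [row[0] for row in rows]
print(kept_ids)
```

In this toy data only `a1` survives: `a3` has an allowed type but on a platform that is not listed, which is why the comparison uses the pair rather than `type` alone.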

services/libs/tinybird/pipes/activities_filtered_historical_cutoff.pipe

Lines changed: 3 additions & 4 deletions
@@ -14,7 +14,8 @@ DESCRIPTION >
     - `platform`: Optional string filter for source platform (e.g., 'github', 'discord', 'slack')
     - `activity_type`: Optional string filter for single activity type (e.g., 'authored-commit')
     - `activity_types`: Optional array of activity types (e.g., ['authored-commit', 'co-authored-commit'])
-    - `onlyContributions`: Optional boolean, defaults to 1 (contributions only), set to 0 for all activities
+    - `includeCodeContributions`: Optional boolean to include code contribution activities. Defaults to 1. Set to 0 to exclude. Inherited from activityTypes_filtered.
+    - `includeCollaborations`: Optional boolean to include or exclude collaboration activities. Inherited from activityTypes_filtered.
     - Response: `id` (activityId), `timestamp`, `type`, `platform`, `memberId`, `organizationId`, `segmentId`.

 NODE activities_filtered_by_timestamp_and_channel
@@ -41,9 +42,7 @@ SQL >
     AND a.platform
     = {{ String(platform, description="Filter activity platform", required=False) }}
     {% end %}
-    {% if not defined(onlyContributions) or (
-    defined(onlyContributions) and onlyContributions == 1
-    ) %} AND a.isContribution {% end %}
+    AND (a.type, a.platform) IN (SELECT activityType, platform FROM activityTypes_filtered)
     {% if defined(activity_type) %}
     AND a.type = {{ String(activity_type, description="Filter activity type", required=False) }}
     {% end %}

services/libs/tinybird/pipes/activities_filtered_retention.pipe

Lines changed: 3 additions & 4 deletions
@@ -13,7 +13,8 @@ DESCRIPTION >
     - `platform`: Optional string filter for source platform (e.g., 'github', 'discord', 'slack')
     - `activity_type`: Optional string filter for single activity type (e.g., 'authored-commit')
     - `activity_types`: Optional array of activity types (e.g., ['authored-commit', 'co-authored-commit'])
-    - `onlyContributions`: Optional boolean, defaults to 1 (contributions only), set to 0 for all activities
+    - `includeCodeContributions`: Optional boolean to include code contribution activities. Defaults to 1. Set to 0 to exclude. Inherited from activityTypes_filtered.
+    - `includeCollaborations`: Optional boolean to include or exclude collaboration activities. Inherited from activityTypes_filtered.
     - `granularity`: Required string for time aggregation and period extension ('daily', 'weekly', 'monthly', 'quarterly', 'yearly')
     - Response: `id` (activityId), `timestamp`, `type`, `platform`, `memberId`, `organizationId`, `segmentId`.

@@ -57,9 +58,7 @@ SQL >
     AND a.platform
     = {{ String(platform, description="Filter activity platform", required=False) }}
     {% end %}
-    {% if not defined(onlyContributions) or (
-    defined(onlyContributions) and onlyContributions == 1
-    ) %} AND a.isContribution {% end %}
+    AND (a.type, a.platform) IN (SELECT activityType, platform FROM activityTypes_filtered)
     {% if defined(activity_type) %}
     AND a.type = {{ String(activity_type, description="Filter activity type", required=False) }}
     {% end %}
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+DESCRIPTION >
+    - `activityTypes_filtered.pipe` allows filtering activityTypes from the respective data source.
+    - By default, this only returns code contribution activities (`includeCodeContributions = 1`).
+    - To return all activities, set `includeCodeContributions = 1`, `includeCollaborations = 1`, and `includeOtherContributions = 1`.
+    - Parameters:
+    - `includeCodeContributions`: Optional boolean to include code contribution activities. Defaults to 1. Set to 0 to exclude.
+    - `includeCollaborations`: Optional boolean to include or exclude collaboration activities.
+    - `includeOtherContributions`: Optional boolean to include other contribution activities (activities that are neither code contributions nor collaborations).
+    - Response: `activityType`, `platform`.
+    - This pipe is used by other downstream pipes as an auxiliary method of filtering data by activity types.
+
+NODE activityTypes_selected
+SQL >
+    %
+    WITH
+    {{ UInt8(includeCodeContributions, default=1) }} AS icc,
+    {{ UInt8(includeCollaborations, default=0) }} AS icol,
+    {{ UInt8(includeOtherContributions, default=0) }} AS ioc
+    SELECT activityType, platform
+    FROM activityTypes
+    WHERE
+    (icc = 1 AND isCodeContribution = 1)
+    OR (icol = 1 AND isCollaboration = 1)
+    OR (ioc = 1 AND isCodeContribution = 0 AND isCollaboration = 0)
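
The three flags in the node above combine with `OR`, so each one independently switches on one class of activity types. The Python sketch below mirrors that `WHERE` clause to make the defaults easier to reason about. The `ActivityType` record, the `select_activity_types` helper, and the sample rows (including the `issue-comment` and `star` types and their flag values) are hypothetical and exist only for this illustration.

```python
from typing import List, NamedTuple, Tuple


class ActivityType(NamedTuple):
    """Hypothetical stand-in for one row of the `activityTypes` data source."""
    activityType: str
    platform: str
    isCodeContribution: int
    isCollaboration: int


def select_activity_types(
    rows: List[ActivityType],
    includeCodeContributions: int = 1,   # same default as the pipe (icc)
    includeCollaborations: int = 0,      # same default as the pipe (icol)
    includeOtherContributions: int = 0,  # same default as the pipe (ioc)
) -> List[Tuple[str, str]]:
    """Return (activityType, platform) pairs using the same OR logic as the SQL."""
    selected = []
    for r in rows:
        is_code = includeCodeContributions == 1 and r.isCodeContribution == 1
        is_collab = includeCollaborations == 1 and r.isCollaboration == 1
        is_other = (
            includeOtherContributions == 1
            and r.isCodeContribution == 0
            and r.isCollaboration == 0
        )
        if is_code or is_collab or is_other:
            selected.append((r.activityType, r.platform))
    return selected


# Assumed sample rows; the flag values are made up for the example.
SAMPLE = [
    ActivityType("authored-commit", "github", 1, 0),
    ActivityType("issue-comment", "github", 0, 1),
    ActivityType("star", "github", 0, 0),
]

print(select_activity_types(SAMPLE))
print(select_activity_types(SAMPLE, includeCollaborations=1))
print(select_activity_types(SAMPLE, includeCodeContributions=0))
```

Under these assumptions, passing `includeCodeContributions=0` with the other flags left at their defaults selects nothing, so any pipe filtering on this result would return no activities.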

services/libs/tinybird/pipes/health_score_active_contributors.pipe

Lines changed: 2 additions & 1 deletion
@@ -7,7 +7,8 @@ SQL >
     SELECT segmentId, COALESCE(uniq(memberId), 0) AS activeContributors
     FROM activityRelations_deduplicated_cleaned_ds
     WHERE
-    memberId != '' AND isContribution
+    memberId != ''
+    AND (type, platform) IN (SELECT activityType, platform FROM activityTypes_filtered)
     {% if defined(project) %}
     AND segmentId = (SELECT segmentId FROM segments_filtered)
     {% if defined(repos) %}

services/libs/tinybird/pipes/health_score_contributor_dependency.pipe

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@ SQL >
     SELECT segmentId, memberId, count() AS contributionCount, MIN(timestamp), MAX(timestamp)
     FROM activityRelations_deduplicated_cleaned_ds
     WHERE
-    memberId != '' AND isContribution
+    memberId != ''
+    AND (type, platform) IN (SELECT activityType, platform FROM activityTypes_filtered)
     {% if defined(project) %}
     AND segmentId = (SELECT segmentId FROM segments_filtered)
     {% if defined(repos) %}

services/libs/tinybird/pipes/health_score_organization_dependency.pipe

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@ SQL >
     SELECT segmentId, organizationId, count() AS contributionCount
     FROM activityRelations_deduplicated_cleaned_ds
     WHERE
-    organizationId != '' AND isContribution
+    organizationId != ''
+    AND (type, platform) IN (SELECT activityType, platform FROM activityTypes_filtered)
     {% if defined(project) %}
     AND segmentId = (SELECT segmentId FROM segments_filtered)
     {% if defined(repos) %}

services/libs/tinybird/pipes/segmentId_aggregates_mv.pipe

Lines changed: 6 additions & 1 deletion
@@ -8,5 +8,10 @@ SQL >
     countDistinctState(memberId) AS contributorCount,
     countDistinctState(organizationId) AS organizationCount
     FROM activityRelations_deduplicated_cleaned_ds
-    WHERE isContribution = true
+    WHERE
+    (type, platform) IN (
+    SELECT activityType, platform
+    FROM activityTypes
+    WHERE isCodeContribution = 1 OR isCollaboration = 1
+    )
     GROUP BY segmentId

services/libs/tinybird/pipes/top_member_org_copy.pipe

Lines changed: 17 additions & 2 deletions
@@ -17,7 +17,14 @@ NODE top_member_org_copy_member_activity_count
 SQL >
     SELECT memberId, count(*) AS activityCount
     FROM activityRelations_deduplicated_cleaned_ds
-    WHERE (timestamp >= (now() - toIntervalYear(10))) AND (timestamp < now())
+    WHERE
+    (timestamp >= (now() - toIntervalYear(10)))
+    AND (timestamp < now())
+    AND (type, platform) IN (
+    SELECT activityType, platform
+    FROM activityTypes
+    WHERE isCodeContribution = 1 OR isCollaboration = 1
+    )
     GROUP BY memberId
     ORDER BY activityCount DESC
     LIMIT 100
@@ -41,7 +48,15 @@ NODE top_member_org_copy_organization_activity_count
 SQL >
     SELECT organizationId, count(*) AS activityCount
     FROM activityRelations_deduplicated_cleaned_ds
-    WHERE (timestamp >= (now() - toIntervalYear(10))) AND (timestamp < now()) AND organizationId != ''
+    WHERE
+    (timestamp >= (now() - toIntervalYear(10)))
+    AND (timestamp < now())
+    AND organizationId != ''
+    AND (type, platform) IN (
+    SELECT activityType, platform
+    FROM activityTypes
+    WHERE isCodeContribution = 1 OR isCollaboration = 1
+    )
     GROUP BY organizationId
     ORDER BY activityCount DESC
     LIMIT 100
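
The two `top_member_org_copy` nodes above now count only activities whose `(type, platform)` pair is a code contribution or a collaboration, within a ten-year window, and keep the top 100. A rough Python sketch of that counting logic for the member node follows. The `top_members` helper, the dictionary-based activity records, and the sample data are assumptions for illustration, and the ten-year window is approximated as 3650 days instead of reproducing `toIntervalYear(10)` exactly.

```python
from collections import Counter
from datetime import datetime, timedelta


def top_members(activities, allowed_types, now, limit=100, years=10):
    """Count activities per member, restricted to allowed (type, platform) pairs.

    `activities` is an iterable of dicts with memberId, type, platform and
    timestamp keys; `allowed_types` is a set of (type, platform) tuples.
    """
    # Approximation of `now() - toIntervalYear(10)`; leap days are ignored.
    cutoff = now - timedelta(days=365 * years)
    counts = Counter(
        a["memberId"]
        for a in activities
        if cutoff <= a["timestamp"] < now
        and (a["type"], a["platform"]) in allowed_types
    )
    # most_common sorts by count descending, like ORDER BY activityCount DESC.
    return counts.most_common(limit)


# Hypothetical data: only authored commits on GitHub are allowed here.
ALLOWED = {("authored-commit", "github")}
NOW = datetime(2025, 1, 1)
ACTIVITIES = [
    {"memberId": "m1", "type": "authored-commit", "platform": "github", "timestamp": datetime(2024, 6, 1)},
    {"memberId": "m1", "type": "authored-commit", "platform": "github", "timestamp": datetime(2023, 6, 1)},
    {"memberId": "m2", "type": "authored-commit", "platform": "github", "timestamp": datetime(2024, 1, 1)},
    {"memberId": "m2", "type": "star", "platform": "github", "timestamp": datetime(2024, 2, 1)},
    {"memberId": "m3", "type": "authored-commit", "platform": "github", "timestamp": datetime(2010, 1, 1)},
]

ranking = top_members(ACTIVITIES, ALLOWED, NOW)
print(ranking)
```

In this sample, `m2`'s `star` activity is dropped by the type filter and `m3`'s commit falls outside the time window, so neither is counted.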
