New domain: SWE by ehsk · Pull Request #138 · ServiceNow/PipelineRL

ehsk · 2026-04-22T16:48:32Z

Adds RL on SWE-style tasks such as SWE-smith and SWE-bench.

Experimented with Qwen2.5-7B-Instruct (grey) and Qwen3-8B (orange).

Credit to: @amilios (this code is mostly taken from his implementation)

amilios

lgtm

rafapi

We should also add this new domain to the multi-domain integration layer:

pipelinerl/domains/multidomain/loader.py
conf/multi_domain/base.yaml
conf/domain_rollouts/base.yaml

rafapi · 2026-05-11T19:50:41Z


        return CausalLMOutputWithValue(
-            loss=outputs.loss,
            logits=outputs.logits,


The value-head model will fail now; we should revert this.

ehsk · 2026-05-12

The loss is computed outside the model, so we only need to return the value-head tensors: https://github.com/ServiceNow/PipelineRL/blob/swe/pipelinerl/finetune/rl/__init__.py#L262

rafapi · 2026-05-13

I see, that should be fine, though I'm not sure it's the most flexible way to use a value head. In any case, with this change I see a failure path in pipelinerl/finetune/eval.py. Also, you should remove loss from CausalLMOutputWithValue and labels from forward().

ehsk · 2026-05-13

Sure, I removed the extra arguments: labels in forward() and loss in CausalLMOutputWithValue.

As for finetune/eval.py: in theory, yes, but eval.py is old code that is never used anywhere else.
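
For readers following the thread, here is a minimal, framework-free sketch of the pattern being discussed: the model output carries only logits and value estimates, forward() takes no labels, and the caller computes the loss. This is not the PipelineRL code. `ToyValueHeadModel`, `external_loss`, and the plain-list "tensors" are hypothetical stand-ins, and the loss shown (next-token NLL plus value MSE) is only a placeholder for whatever the RL objective in pipelinerl/finetune/rl actually computes.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class CausalLMOutputWithValue:
    """Hypothetical stand-in for the PR's output type: note there is no `loss` field."""
    logits: List[List[float]]  # shape [seq_len][vocab_size]
    value: List[float]         # one value estimate per position


class ToyValueHeadModel:
    """Toy model for illustration only; forward() accepts no `labels` argument."""

    def __init__(self, vocab_size: int = 4):
        self.vocab_size = vocab_size

    def forward(self, input_ids: List[int]) -> CausalLMOutputWithValue:
        # Deterministic fake logits that favour token (id + 1) % vocab_size.
        logits = [
            [1.0 if v == (tok + 1) % self.vocab_size else 0.0
             for v in range(self.vocab_size)]
            for tok in input_ids
        ]
        # Constant fake value estimates, one per position.
        value = [0.5 for _ in input_ids]
        return CausalLMOutputWithValue(logits=logits, value=value)


def token_log_prob(logits_row: List[float], target: int) -> float:
    """Log-softmax of one logits row, evaluated at the target token."""
    log_z = math.log(sum(math.exp(x) for x in logits_row))
    return logits_row[target] - log_z


def external_loss(output: CausalLMOutputWithValue,
                  input_ids: List[int],
                  returns: List[float]) -> float:
    """Placeholder loss computed by the caller from the returned fields."""
    steps = len(input_ids) - 1
    # Position t predicts token t + 1.
    nll = -sum(token_log_prob(output.logits[t], input_ids[t + 1])
               for t in range(steps)) / steps
    value_mse = sum((v - r) ** 2
                    for v, r in zip(output.value, returns)) / len(returns)
    return nll + value_mse


model = ToyValueHeadModel()
ids = [0, 1, 2]
out = model.forward(ids)
loss = external_loss(out, ids, returns=[0.5, 0.5, 0.5])
```

Any caller that still reads `outputs.loss` (the concern raised about finetune/eval.py) would hit an AttributeError with an output type shaped like this, which is why the remaining call sites matter.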

ehsk added 2 commits April 22, 2026 16:45

initial commit for adding swe tasks as a new domain

355b863

loss never used from model output so better not to pass labels

997ebec

ehsk requested review from amilios and rafapi April 22, 2026 16:49

Merge branch 'main' into swe

b917dd8

amilios approved these changes Apr 28, 2026


edit parser improved to support "```" blocks as optional

d35249c

rafapi requested changes May 11, 2026


ehsk added 3 commits May 12, 2026 19:48

Merge branch 'main' into swe

0a23d2e

default datasets updated

3aa4103

unused parameters removed

7f1e8a8

ehsk wants to merge 7 commits into main from swe
