Pull requests: THUDM/slime
Pull requests list
(fix): handle none/dict rewards in _compute_zero_std_metrics
#2157
opened Jul 1, 2026 by
atoniolo76
fix(model_provider): add attention_backend and attention_softmax_in_fp32 to provider
#2155
opened Jun 30, 2026 by
lyzustc
Contributor
fix: support multi-head MTP weight mapping in MimoBridge (closes #2131)
#2154
opened Jun 30, 2026 by
botbikamordehai2-sketch
fix(update_weight): bracket IPv6 master address in tcp:// init_method
#2151
opened Jun 30, 2026 by
realJaydenCheng
[docker] Upgrade to sglang v0.5.14
run-ci-image
#2149
opened Jun 29, 2026 by
zhuzilin
Contributor
feat(p2p): add shard-level weight update with automatic broadcast fallback
#2146
opened Jun 29, 2026 by
CalvinXKY
Contributor
3 of 5 tasks
docs: fix dead examples/README link to low_precision
#2142
opened Jun 29, 2026 by
aoshen02
Contributor
docs(examples): fix broken markdown links in rollout_buffer and examples
#2137
opened Jun 27, 2026 by
CalvinXKY
Contributor
docs(examples): list coding_agent_rl in examples/README
#2133
opened Jun 26, 2026 by
aoshen02
Contributor
Skip entropy gradient computation when entropy_coef == 0
#2130
opened Jun 25, 2026 by
CSUN1997
Support partial rollout resume in Search-R1 example
#2128
opened Jun 23, 2026 by
OLIVER-XYP
Reduce entropy logging memory when entropy coef is zero
#2127
opened Jun 23, 2026 by
none0663
Contributor
fix(data): prompt-length filtering crashes on VLM dataset with apply_chat_template
#2126
opened Jun 23, 2026 by
Meihan-chen
Add test for megatron server
run-ci-changed
#2123
opened Jun 23, 2026 by
zhuzilin
Contributor
fix(partial-rollout): cap max_new_tokens by prior response length
#2122
opened Jun 23, 2026 by
none0663
Contributor
fix(retool): coerce list prompt to str in reward_func
#2120
opened Jun 23, 2026 by
mvanhorn
fix(delta-sync): surface failed engine apply results instead of silently discarding them
#2119
opened Jun 22, 2026 by
tanishkasinghhh
fix(rm_hub): grade the final ###Response segment in deepscaler reward
#2116
opened Jun 22, 2026 by
SuperMarioYL
fix(rm_hub): guard deepscaler reward against a missing response
#2115
opened Jun 21, 2026 by
vjsai