忍者

byteprobe

AI & ML interests

RL | NLP | LLM | multimodal | evaluations | agents

Recent Activity

liked a model 26 days ago

moonshotai/Kimi-VL-A3B-Thinking-2506

Image-Text-to-Text • 16B • Updated 21 days ago • 43.6k • 219

upvoted a changelog 26 days ago

Organization and User profiles now include repository listing pages

Changelog • 28 days ago • 108

liked a dataset 26 days ago

nvidia/OpenScience

Viewer • Updated 30 days ago • 4.48M • 3.45k • 63

liked 2 models 26 days ago

google/magenta-realtime

Updated 4 days ago • 246 • 446

mistralai/Mistral-Small-3.2-24B-Instruct-2506

Image-Text-to-Text • 24B • Updated 11 days ago • 126k • 364

liked 2 datasets 28 days ago

miriad/miriad-5.8M

Viewer • Updated Jun 11 • 5.82M • 2.11k • 49

EssentialAI/essential-web-v1.0

Preview • Updated 26 days ago • 434k • 183

upvoted 2 papers 28 days ago

Scaling Test-time Compute for LLM Agents

Paper • 2506.12928 • Published Jun 15 • 61

Essential-Web v1.0: 24T tokens of organized web data

Paper • 2506.14111 • Published Jun 17 • 41

liked a dataset 28 days ago

nvidia/AceReason-1.1-SFT

Viewer • Updated 30 days ago • 3.96M • 13.1k • 70

liked 2 models 28 days ago

vrgamedevgirl84/Wan14BT2VFusioniX

Text-to-Video • Updated 26 days ago • 451

inclusionAI/Ming-Lite-Omni

Any-to-Any • 19B • Updated 18 days ago • 496 • 160

liked a Space 29 days ago

121

Open-LLM performances are plateauing, let’s make the leaderboard steep again

🏔

Update leaderboard for fair model evaluation

upvoted 2 papers about 1 month ago

DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

Paper • 2506.11763 • Published Jun 13 • 66

Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

Paper • 2506.06395 • Published Jun 5 • 128

liked a model about 1 month ago

nvidia/AceReason-Nemotron-1.1-7B

Text Generation • 8B • Updated 7 days ago • 55.6k • 53

upvoted a paper about 1 month ago

Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning

Paper • 2506.10521 • Published Jun 12 • 71

liked a model about 1 month ago

QuixiAI/Qwen3-72B-Embiggened

73B • Updated Jun 14 • 89 • 19

liked a dataset about 1 month ago

lingshu-medical-mllm/ReasonMed

Viewer • Updated 24 days ago • 1.11M • 1.58k • 52

upvoted a paper about 1 month ago

ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

Paper • 2506.09513 • Published Jun 11 • 97
