The post Together AI Launches DSGym Framework for Training Data Science AI Agents appeared on BitcoinEthereumNews.com. Rebeca Moen Jan 26, 2026 23:09 TogetherThe post Together AI Launches DSGym Framework for Training Data Science AI Agents appeared on BitcoinEthereumNews.com. Rebeca Moen Jan 26, 2026 23:09 Together

Together AI Launches DSGym Framework for Training Data Science AI Agents

For feedback or concerns regarding this content, please contact us at [email protected]


Rebeca Moen
Jan 26, 2026 23:09

Together AI’s DSGym framework benchmarks LLM agents on 90+ bioinformatics tasks and 92 Kaggle competitions. Their 4B parameter model matches larger rivals.

Together AI has released DSGym, a comprehensive framework for evaluating and training AI agents designed to perform data science tasks autonomously. The framework includes over 90 bioinformatics challenges and 92 Kaggle competition datasets, providing standardized benchmarks that address fragmentation issues plaguing existing evaluation methods.

The standout claim: Together AI’s 4 billion parameter model, trained using DSGym’s synthetic trajectory generation, achieves performance competitive with models 50 times its size on certain benchmarks.

Benchmark Results Show Surprising Efficiency

The published benchmarks reveal interesting performance dynamics across model sizes. Together AI’s Qwen3-4B-DSGym-SFT-2k model—fine-tuned using the framework—scored 59.36% on QRData-Verified and 77.78% on DABStep-easy tasks. That puts it ahead of the base Qwen3-4B-Instruct model (45.27% and 58.33% respectively) and competitive with models like Deepseek-v3.1 and GPT-OSS-120B on several metrics.

Claude 4.5 Sonnet currently leads the pack on harder tasks, hitting 37.04% on DABStep-hard compared to the fine-tuned 4B model’s 33.07%. But the gap narrows considerably given the massive difference in model scale.

Kimi-K2-Instruct posted the highest QRData-Verified score at 63.68%, while GPT-4o achieved 92.26% on DAEval-Verified—suggesting different architectures excel at different task types.

Why This Matters for AI Development

DSGym tackles a real problem in the AI agent space. Current benchmarks suffer from inconsistent evaluation interfaces and limited task diversity, making it difficult to compare agent performance meaningfully. The framework’s modular architecture allows researchers to add new tasks, agent scaffolds, and tools without rebuilding from scratch.

The execution-verified data synthesis pipeline is particularly notable. Rather than training on static datasets, the system generates synthetic training trajectories that are validated through actual code execution—reducing the garbage-in-garbage-out problem that hampers many AI training pipelines.

For companies building AI-powered data analysis tools, DSGym provides a standardized way to measure progress. The bioinformatics focus (DSBio) and prediction task coverage (DSPredict) extend beyond generic coding benchmarks into domain-specific applications where AI agents could deliver real productivity gains.

What’s Next

The framework is positioned as an evolving testbed rather than a static benchmark suite. Together AI has emphasized the extensibility angle, suggesting they’ll continue adding task categories and evaluation metrics. With AI agent development accelerating across the industry, having a common evaluation standard could help separate genuine capability improvements from benchmark gaming—though that’s always easier said than done.

Image source: Shutterstock

Source: https://blockchain.news/news/together-ai-launches-dsgym-framework-training-data-science-agents

Disclaimer: The articles reposted on this site are sourced from public platforms and are provided for informational purposes only. They do not necessarily reflect the views of MEXC. All rights remain with the original authors. If you believe any content infringes on third-party rights, please contact [email protected] for removal. MEXC makes no guarantees regarding the accuracy, completeness, or timeliness of the content and is not responsible for any actions taken based on the information provided. The content does not constitute financial, legal, or other professional advice, nor should it be considered a recommendation or endorsement by MEXC.

You May Also Like

Stunning 96% Surge And 50% Plunge Define Volatile Market Session

Stunning 96% Surge And 50% Plunge Define Volatile Market Session

The post Stunning 96% Surge And 50% Plunge Define Volatile Market Session appeared on BitcoinEthereumNews.com. Crypto Gainers And Losers: Stunning 96% Surge And
Share
BitcoinEthereumNews2026/04/03 09:20
Come Back To Me’ To Air At BIFF Before Global Release

Come Back To Me’ To Air At BIFF Before Global Release

The post Come Back To Me’ To Air At BIFF Before Global Release appeared on BitcoinEthereumNews.com. Kim Woo-sung performs onstage during “The Rose: Come Back to Me” premiere during the 2025 Tribeca Festival. Photo by Roy Rochlin/Getty Images for Tribeca Festival) Getty Images for Tribeca Festival The Rose: Come Back To Me will screen three times at the Busan International Film Festival and at additional film festivals worldwide, before its global theatrical release in 2026. The Korean alt-pop indie band known as The Rose is composed of Woosung, Dojoon, Hajoon, and Taegyeom. From their earliest days,busking in Hongdae, the band has captivated audiences with their distinctive genre-blending sound. Their first full-length album Heal sparked the global Heal Together World Tour, drawing over 90,000 fans and leading to high-profile festival appearances, including headlining the Bacardi Stage at Lollapalooza 2023. They reached a new milestone with their sophomore album Dual, which debuted on the Billboard 200. Building on this success, The Rose sold more than 150,000 tickets on their Dawn to Dusk Tour and delivered a show-stopping set at Coachella 2024. This year they went on a global tour, promoting their latest album WRLD alongside their documentary The Rose: Come Back to Me, which premiered at the Tribeca Film Festival in June 2025. “Knowing how dominant Korean culture is globally—from K-Pop Demon Hunters to Parasite—international audiences are all eager to go deeper and learn more” said Diane Quon and Sanjay M. Sharma on behalf of the producing team behind the popular Tribeca doc. “The Rose is as much a music doc as it is a coming-of-age story—about a group of friends finding their own way through the world. It’s a story of heartbreak and healing, conformity and individuality, and ultimately about the transformative power of music around the world.” Hajoon, Taegyeom, Kim Woo-sung and Dojoon perform onstage during “The Rose: Come Back to Me” premiere.. (Photo by Roy…
Share
BitcoinEthereumNews2025/09/19 06:53
Hong Kong Monetary Authority cuts interest rates by 25 basis points

Hong Kong Monetary Authority cuts interest rates by 25 basis points

PANews reported on September 18 that according to Jinshi, the Hong Kong Monetary Authority lowered the benchmark interest rate by 25 basis points to 4.50%, and the Federal Reserve cut interest rates by 25 basis points overnight.
Share
PANews2025/09/18 08:06

Trade GOLD, Share 1,000,000 USDT

Trade GOLD, Share 1,000,000 USDTTrade GOLD, Share 1,000,000 USDT

0 fees, up to 1,000x leverage, deep liquidity