TorchForge RL Pipelines Now Operable on Together AI’s Cloud

2025/12/06 15:05


Jessie A Ellis
Dec 04, 2025 17:54

Together AI has brought TorchForge RL pipelines to its cloud platform, with support for distributed training and sandboxed environments, and has published a BlackJack training demo.

TorchForge reinforcement learning (RL) pipelines now run on Together AI’s Instant Clusters, with support for distributed training, tool execution, and sandboxed environments, as demonstrated by an open-source BlackJack training demo, according to together.ai.

The AI Native Cloud: Foundation for Next-Gen RL

Building flexible, scalable reinforcement learning systems requires compute infrastructure and tooling that work well together. Modern RL pipelines go beyond basic training loops: they rely on distributed rollouts, high-throughput inference, and coordinated use of CPU and GPU resources.

The PyTorch stack, including TorchForge and Monarch, now supports distributed training on Together Instant Clusters. These clusters provide:

  • Low-latency GPU communication: InfiniBand/NVLink topologies for efficient RDMA-based data transfers and distributed actor messaging.
  • Consistent cluster bring-up: Clusters come preconfigured with drivers, NCCL, CUDA, and the GPU operator, so PyTorch distributed jobs run without manual setup.
  • Heterogeneous RL workload scheduling: GPU-optimized nodes for policy replicas and trainers, alongside CPU-optimized nodes for environment and tool execution.

Together AI’s clusters are well suited to RL frameworks that combine GPU-bound model computation with CPU-bound environment workloads.

Advanced Tool Integration and Demonstration

A significant portion of RL workloads involves executing tools, running code, or interacting with sandboxed environments. Together AI’s platform natively supports these requirements through:

  • Together CodeSandbox: MicroVM environments tailored for tool-use, coding tasks, and simulations.
  • Together Code Interpreter: Fast, isolated Python execution suitable for unit-test-based reward functions or code-evaluation tasks.

Both CodeSandbox and Code Interpreter integrate with OpenEnv and TorchForge environment services, so rollout workers can use these tools during training.
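
To make the idea of a unit-test-based reward function concrete, here is a minimal sketch in Python. It is not Together's API: the function name unit_test_reward and its parameters are hypothetical, and a local subprocess stands in for the hosted interpreter, which would provide real isolation. The reward is simply the fraction of test snippets that run without error against the model's candidate code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def unit_test_reward(candidate_code: str, test_cases: list[str], timeout_s: float = 5.0) -> float:
    """Return the fraction of test cases that pass for the candidate code.

    Hypothetical helper for illustration. Each test case is a Python snippet
    (for example an assert statement) appended to the candidate code and run
    in a fresh interpreter process. A local subprocess is NOT a security
    sandbox; it only stands in for a hosted, isolated interpreter.
    """
    if not test_cases:
        return 0.0
    passed = 0
    with tempfile.TemporaryDirectory() as workdir:
        for i, test in enumerate(test_cases):
            script = Path(workdir) / f"case_{i}.py"
            script.write_text(candidate_code + "\n\n" + test + "\n")
            try:
                result = subprocess.run(
                    [sys.executable, str(script)],
                    capture_output=True,
                    timeout=timeout_s,
                    cwd=workdir,
                )
            except subprocess.TimeoutExpired:
                continue  # a hung candidate earns nothing for this case
            if result.returncode == 0:
                passed += 1
    return passed / len(test_cases)
```

Running each case in its own process means one crashing or hanging test does not affect the others, at the cost of some startup overhead per case.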

BlackJack Training Demo

Together AI has released a demo of a TorchForge RL pipeline running on its Instant Clusters and interacting with an OpenEnv environment hosted on Together CodeSandbox. The demo, adapted from a Meta reference implementation, trains a Qwen 1.5B model to play BlackJack using GRPO. The pipeline combines a vLLM policy server, the BlackJack environment, a reference model, an off-policy replay buffer, and a TorchTitan trainer, all connected through Monarch’s actor mesh, with TorchStore handling weight synchronization.
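
For readers unfamiliar with GRPO, its central step is scoring each sampled episode relative to the other episodes sampled for the same prompt, rather than against a learned value function. The sketch below shows only that normalization step in plain Python; it is not taken from the demo repository, the function name is hypothetical, and the choice of population standard deviation and the eps value are assumptions that vary between implementations.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize the rewards of one group of rollouts for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps). The eps term
    (an assumed value) avoids division by zero when all rewards are equal.
    """
    if not rewards:
        return []
    baseline = mean(rewards)
    scale = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - baseline) / (scale + eps) for r in rewards]


# Example: four BlackJack episodes from the same starting prompt,
# scored +1 for a win and -1 for a loss.
advantages = group_relative_advantages([1.0, -1.0, 1.0, -1.0])
```

When every episode in a group receives the same reward, all advantages are zero, so that group contributes no policy gradient signal.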

The OpenEnv GRPO BlackJack repository includes Kubernetes manifests and setup scripts. Deployment and training start with a few kubectl commands, and users can experiment with model configurations and GRPO hyperparameters.

Additionally, a standalone integration wraps Together’s Code Interpreter as an OpenEnv environment, enabling RL agents to interact with the Interpreter like any other environment. This integration allows RL pipelines to be applied to diverse tasks such as coding and mathematical reasoning.
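
The pattern of treating a code executor as an environment can be sketched as follows. This is an illustration only: the class and field names are hypothetical, the reset/step interface is an assumed Gymnasium-style convention rather than OpenEnv's actual classes, and the in-process exec call is an unsafe local stand-in for a request to a remote sandbox.

```python
import contextlib
import io
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str  # captured stdout, or an error message
    reward: float
    done: bool


class CodeTaskEnv:
    """Hypothetical single-turn environment whose action is Python source code."""

    def __init__(self, prompt: str, expected_stdout: str):
        self.prompt = prompt
        self.expected_stdout = expected_stdout
        self._done = True

    def reset(self) -> str:
        """Start an episode and return the task prompt as the first observation."""
        self._done = False
        return self.prompt

    def step(self, action: str) -> StepResult:
        """Run the submitted code once and score its printed output."""
        if self._done:
            raise RuntimeError("call reset() before step()")
        self._done = True
        buffer = io.StringIO()
        try:
            # Stand-in for a sandboxed remote execution; never exec untrusted
            # code in-process outside of a toy example like this one.
            with contextlib.redirect_stdout(buffer):
                exec(action, {"__name__": "__candidate__"})
        except Exception as exc:
            return StepResult(f"error: {type(exc).__name__}: {exc}", 0.0, True)
        output = buffer.getvalue().strip()
        reward = 1.0 if output == self.expected_stdout.strip() else 0.0
        return StepResult(output, reward, True)
```

A rollout worker would call reset() to obtain the prompt, pass it to the policy, and submit the generated code through step(); because the interface matches other environments, the same training loop can serve a card game or a math problem.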

The demonstrations show that multi-component RL training can run on the Together AI Cloud, setting the stage for a flexible, open RL framework in the PyTorch ecosystem that scales on the platform.


Source: https://blockchain.news/news/torchforge-rl-pipelines-operable-together-ai-cloud

