
Tether Expands AI Dataset with QVAC Genesis II, Reaching 148 Billion Tokens



Timothy Morano
Jan 08, 2026 09:00

Tether’s QVAC Genesis II release adds 107 billion tokens to what the company describes as the world’s largest synthetic educational dataset, which is built for AI pre-training and now spans 19 educational domains.

Tether Data’s AI research division, QVAC, has unveiled QVAC Genesis II, significantly expanding its synthetic educational dataset for artificial intelligence pre-training. This latest release adds 107 billion new tokens, bringing the total to 148 billion tokens across 19 educational domains, according to Tether.

Enhancing AI Training

Building on QVAC Genesis I, the new release covers 10 additional domains, including chemistry, computer science, and machine learning. It also updates the college-level physics material using a more advanced methodology. Tether describes Genesis I and II together as the most comprehensive synthetic educational dataset available to the public.

Innovative Data Generation

QVAC Genesis II introduces a novel data generation approach called Option-Level Reasoning. This method analyzes every answer option in multiple-choice questions, reinforcing correct reasoning and addressing common misconceptions. The approach aims to enhance clarity, causality, and decision-making in AI training data.

This complements the Failure Analysis method from Genesis I, forming a dual-method pipeline intended to ensure educational value in every generated question. According to Tether, evaluations show that models trained on Genesis II data reach higher reasoning accuracy and produce clearer answers than models trained on earlier datasets.
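
To make the idea of Option-Level Reasoning more concrete, the following minimal Python sketch shows one way a multiple-choice question could be turned into a training passage that discusses every answer option in turn. It is an illustration only and not QVAC's actual pipeline: the `Option` class, the `render_option_level_sample` function, the output layout, and the sample question are all hypothetical and do not come from the Genesis II release.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One answer choice (hypothetical structure, not QVAC's schema)."""
    label: str        # e.g. "A"
    text: str         # the answer text shown to the student
    is_correct: bool
    rationale: str    # why it is right, or which misconception it reflects


def render_option_level_sample(question: str, options: list[Option]) -> str:
    """Build one passage that comments on every option, not only the answer."""
    correct = [o for o in options if o.is_correct]
    if len(correct) != 1:
        raise ValueError("expected exactly one correct option")

    lines = [f"Question: {question}"]
    for o in options:
        verdict = "Correct" if o.is_correct else "Incorrect"
        lines.append(f"{o.label}. {o.text} -> {verdict}: {o.rationale}")
    lines.append(f"Answer: {correct[0].label}")
    return "\n".join(lines)


# Illustrative question written for this sketch, not taken from the dataset.
sample = render_option_level_sample(
    "What is the SI unit of force?",
    [
        Option("A", "joule", False, "the joule measures energy, not force"),
        Option("B", "newton", True, "one newton is one kilogram metre per second squared"),
        Option("C", "watt", False, "the watt measures power"),
    ],
)
print(sample)
```

In this sketch each wrong option carries its own explanation, which is the property the article attributes to the method: the passage addresses the misconception behind each distractor as well as the reasoning behind the correct answer.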

Commitment to Open AI Research

Tether aims to shift the focus of AI training from volume to structure and reasoning. Paolo Ardoino, CEO of Tether, emphasized the importance of understanding over fluency in AI development. The dataset is available under a Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) license, supporting open, community-driven AI research.

The release aligns with QVAC’s mission to advance decentralized intelligence, allowing AI models to be trained and deployed without reliance on centralized cloud platforms. This approach seeks to lower innovation barriers and ensure accessible, high-quality AI training data for the global research community.

Further Information

The technical details of the dataset, titled “QVAC Genesis II: Expanding the Largest and Highest-Quality Multi-domain Educational Synthetic Dataset for Pre-training,” are available on the QVAC research blog. Additionally, researchers can access the dataset and models on Hugging Face.


Source: https://blockchain.news/news/tether-expands-ai-dataset-qvac-genesis-ii
