Tether is stepping deeper into artificial intelligence with a new release that could significantly lower the barrier to training large language models.
The company, best known for issuing Tether (USDT), has introduced a framework within its QVAC platform that enables AI models to be fine-tuned on everyday consumer devices—including smartphones and non-Nvidia GPUs.
Key Takeaways
- Tether introduced a new QVAC framework that enables large AI models to be fine-tuned on smartphones and consumer-grade hardware.
- The integration of Microsoft BitNet significantly reduces memory usage, making it possible to run billion-parameter models on limited devices.
- The system removes reliance on Nvidia by supporting AI training across chips from AMD, Intel, Apple, and mobile GPUs.
- Benchmarks show that AI models can be trained in minutes to hours on smartphones, with inference running faster on mobile GPUs than on CPUs.
- The development aligns with a broader trend of crypto firms expanding into AI infrastructure and decentralized, on-device computing solutions.
A Shift Away From Expensive AI Infrastructure
Training modern AI systems has long required access to specialized hardware, particularly GPUs from Nvidia, or costly cloud infrastructure. Tether’s latest development aims to change that dynamic.
Built on Microsoft BitNet and combined with LoRA (Low-Rank Adaptation) techniques, the framework dramatically reduces memory and compute requirements. According to the company, this allows models with up to 1 billion parameters to be fine-tuned on smartphones in under two hours, with smaller models completing training in minutes.
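To see why LoRA makes fine-tuning so much cheaper, consider a minimal sketch of the technique itself (this is the generic Low-Rank Adaptation idea, not Tether's actual implementation; the matrix sizes and rank below are illustrative):

```python
import numpy as np

# Illustrative LoRA (Low-Rank Adaptation) sketch: instead of updating a full
# weight matrix W (d_out x d_in), training only touches two small factors
# A (r x d_in) and B (d_out x r), where the rank r is far smaller than d.
rng = np.random.default_rng(0)
d_in, d_out, r = 1024, 1024, 8

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weights
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # trainable, initialized to zero

def forward(x):
    # Effective weight is W + B @ A; only A and B would receive gradients.
    return x @ (W + B @ A).T

out = forward(rng.standard_normal((1, d_in)))  # shape (1, d_out)

full_params = W.size
lora_params = A.size + B.size
print(f"full fine-tune params: {full_params:,}")
print(f"LoRA params:           {lora_params:,}")
print(f"reduction:             {full_params / lora_params:.0f}x")
```

For this single 1024×1024 layer, the trainable parameter count drops from roughly a million to about sixteen thousand, which is the kind of saving that makes on-device fine-tuning plausible at all.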
The system is designed to run across a wide range of hardware, including chips from Intel, AMD, and Apple, as well as the mobile GPUs found in Qualcomm-based and Apple devices.
In benchmark tests, Tether engineers reported that a 125-million-parameter model could be trained in roughly 10 minutes on a flagship Android device, while a 1-billion-parameter model completed similar tasks in just over an hour on both Android and iOS devices.
The team also pushed the limits further, demonstrating that models as large as 13 billion parameters could be fine-tuned on a smartphone.
Efficiency Gains Through 1-Bit Architecture
A key driver behind this performance is BitNet’s 1-bit model design, which significantly cuts memory usage compared to traditional 16-bit systems. Tether claims its approach can reduce VRAM requirements by as much as 77.8%, freeing up resources for larger models to run on limited hardware.
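A weights-only back-of-the-envelope calculation shows why low-bit formats matter at this scale. The figures below are illustrative and cover weight storage alone; real VRAM usage also includes activations, caches, and optimizer state, so they will not match Tether's reported 77.8% figure exactly:

```python
# Back-of-envelope memory estimate for storing model weights, illustrating
# why low-bit formats matter. Weights-only and illustrative: real VRAM use
# also includes activations, KV caches, and optimizer state.
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB

n = 1e9  # a 1-billion-parameter model, as in the benchmarks above
fp16 = weight_memory_gb(n, 16)     # conventional 16-bit weights
b158 = weight_memory_gb(n, 1.58)   # BitNet-style ternary (~1.58-bit) weights

print(f"16-bit weights:   {fp16:.2f} GB")
print(f"1.58-bit weights: {b158:.2f} GB")
print(f"weights-only saving: {(1 - b158 / fp16) * 100:.1f}%")
```

On weights alone, a 1-billion-parameter model shrinks from about 2 GB to under 0.2 GB, comfortably within the memory budget of a modern smartphone.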
The framework also improves inference speeds. On mobile GPUs, BitNet-based models reportedly run between two and eleven times faster than on CPUs, suggesting that modern smartphones are increasingly capable of handling workloads once reserved for data centers.
Another notable milestone is the framework’s compatibility with non-Nvidia hardware for LoRA fine-tuning — something that has historically been limited to Nvidia’s ecosystem. This broadens access for developers working with alternative chipsets and consumer-grade devices.
Decentralizing AI Development
Tether is positioning the release as part of a broader push toward decentralizing AI. By enabling on-device training, the system allows sensitive data to remain local rather than being sent to centralized servers. This opens the door to federated learning, where models can be updated across distributed devices without compromising privacy.
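The federated pattern can be sketched in a few lines: each device trains on its own data, and only model weights, never raw samples, travel back to be averaged. This is the generic federated-averaging (FedAvg) idea, not Tether's actual protocol; the tiny linear model and all names below are illustrative:

```python
import numpy as np

# Minimal federated-averaging (FedAvg) sketch: each simulated device fits a
# tiny linear model on its own local data, and only the resulting weights --
# never the raw data -- are returned to the server and averaged.
rng = np.random.default_rng(42)
true_w = np.array([2.0, -1.0])  # ground truth the devices collectively learn

def local_train(w, n_samples=100, steps=50, lr=0.1):
    # One device's update: gradient descent on its private samples.
    X = rng.standard_normal((n_samples, 2))
    y = X @ true_w + rng.standard_normal(n_samples) * 0.1
    w = w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / n_samples
        w -= lr * grad
    return w

global_w = np.zeros(2)
for _round in range(3):
    # Each round: five devices train locally; the server averages the weights.
    client_weights = [local_train(global_w) for _ in range(5)]
    global_w = np.mean(client_weights, axis=0)

print("learned weights:", np.round(global_w, 2))
```

The averaged model converges toward the shared target even though no device's data ever leaves it, which is the privacy property the article describes.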
Paolo Ardoino, CEO of Tether, emphasized the broader implications of this shift:
“Intelligence will be a key determining factor in the future of society. It has the potential to improve the stability of society, serve as connective tissue, or further empower the few. The future of AI should be accessible, available, and open to people and builders everywhere, and it should not require an absurd amount of resources only available to a handful of cloud providers.”
He added that reliance on centralized infrastructure risks slowing innovation and concentrating power, while on-device capabilities could make AI development more inclusive.
“By enabling meaningful large-model training on consumer hardware, including smartphones, Tether’s QVAC is proving that advanced AI can be decentralized, inclusive, and empowering for everyone.”
Crypto Firms Double Down on AI
Tether’s announcement reflects a broader trend across the crypto sector, where companies are increasingly investing in AI infrastructure and high-performance computing.
Mining firms, in particular, have begun repurposing their hardware for AI workloads. Cipher Mining recently secured a multibillion-dollar agreement tied to AI data center capacity, while Core Scientific obtained a major credit facility to expand its infrastructure. Meanwhile, HIVE Digital Technologies has reported rising revenues driven in part by AI-focused operations.
At the same time, AI agents — autonomous programs capable of executing transactions and interacting with services — are gaining traction within blockchain ecosystems. Platforms like Coinbase and Alchemy have introduced tools that allow these agents to operate onchain, while initiatives backed by firms such as Franklin Templeton are exploring enterprise use cases.
What This Means for Developers
If Tether’s claims hold up under wider adoption, the implications could be significant. Developers who previously relied on expensive GPU clusters may soon be able to train and customize models locally using devices they already own.
This could accelerate experimentation, particularly in regions or communities with limited access to high-end infrastructure. It also aligns with growing concerns around data privacy, as on-device processing reduces the need to transmit sensitive information to third-party servers.
Still, questions remain about scalability, real-world performance across diverse workloads, and how the framework compares to established AI training pipelines.
What is clear, however, is that the line between consumer hardware and enterprise AI capability is beginning to blur. With QVAC’s BitNet LoRA framework, Tether is betting that the future of AI development will be far more distributed — and far more accessible — than it is today.