Smart Homez™
Interviews

VAST Data Redesigns AI Inference Architecture for the Agentic Era with NVIDIA

By Editorial Team | January 6, 2026 | Updated: January 7, 2026 | 3 Mins Read


The VAST AI Operating System, running natively on NVIDIA BlueField-4 DPUs, collapses legacy storage tiers to deliver a shared, pod-scale KV cache with deterministic access for long-context, multi-turn and multi-agent inference.

VAST Data, the AI Operating System company, announced a new inference architecture that enables NVIDIA Inference Context Memory Storage Platform deployments for the era of long-lived, agentic AI. The platform is a new class of AI-native storage infrastructure for gigascale inference. Built on NVIDIA BlueField-4 DPUs and Spectrum-X Ethernet networking, it accelerates AI-native key-value (KV) cache access, enables high-speed inference context sharing across nodes, and delivers a major leap in power efficiency.

As inference evolves from single prompts into persistent, multi-turn reasoning across agents, the assumption that context stays local breaks down. Performance is increasingly governed by how efficiently inference history (the KV cache) can be stored, restored, reused, extended, and shared under sustained load – not just by how fast GPUs can compute.
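The economics of storing and reusing KV cache can be illustrated with a toy sketch. This is not VAST's or NVIDIA's API – every name below (`KVCacheStore`, `prefill_cost`) is hypothetical – but it shows why persisting per-turn KV state matters: if a conversation's earlier turns are cached, only the new suffix of each request needs prefill compute.

```python
# Illustrative sketch (hypothetical names, not a vendor API): persisting
# KV cache across turns means only the uncached suffix is prefilled.
from dataclasses import dataclass, field

@dataclass
class KVCacheStore:
    """Toy shared store keyed by token prefix (a stand-in for a
    pod-scale cache tier; real systems key on fixed-size token blocks)."""
    _store: dict = field(default_factory=dict)

    def put(self, prefix: tuple, kv: list) -> None:
        self._store[prefix] = kv

    def get(self, prefix: tuple):
        return self._store.get(prefix)

def prefill_cost(tokens: tuple, store: KVCacheStore) -> int:
    """How many tokens must actually be prefilled, after reusing the
    longest cached prefix (if any)."""
    for cut in range(len(tokens), 0, -1):
        if store.get(tokens[:cut]) is not None:
            return len(tokens) - cut  # only the new suffix needs compute
    return len(tokens)

store = KVCacheStore()
turn1 = tuple("What is RDMA?".split())
store.put(turn1, kv=["<kv-blocks>"])  # persist KV state after turn 1

turn2 = turn1 + tuple("And how does it cut latency?".split())
print(prefill_cost(turn2, store))  # → 6: only the six new tokens
```

Without the store, turn 2 would prefill all nine tokens; with it, three are restored from cache. At production context lengths the reused prefix dominates, which is why time-to-first-token becomes a storage-path problem rather than a compute problem.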

VAST is rebuilding the inference data path by running VAST AI Operating System (AI OS) software natively on NVIDIA BlueField-4 DPUs, embedding critical data services directly into the GPU server where inference executes, as well as in a dedicated data node architecture. This design removes traditional client-server contention and eliminates unnecessary copies and hops that inflate time-to-first-token (TTFT) as concurrency rises. Combined with VAST's parallel Disaggregated Shared-Everything (DASE) architecture, every host can access a shared, globally coherent context namespace without the coordination tax that causes bottlenecks at scale, enabling a streamlined path from GPU memory to persistent NVMe storage over RDMA fabrics.


"Inference is becoming a memory system, not a compute job. The winners won't be the clusters with the most raw compute – they'll be the ones that can move, share, and govern context at line rate," said John Mao, Vice President, Global Technology Alliances at VAST Data. "Continuity is the new performance frontier. If context isn't accessible on demand, GPUs idle and economics collapse. With the VAST AI Operating System on NVIDIA BlueField-4, we're turning context into shared infrastructure – fast by default, policy-driven when needed, and built to stay predictable as agentic AI scales."

Beyond raw performance, VAST gives AI-native organizations and enterprises deploying NVIDIA AI factories a path to production-grade inference coordination with high levels of efficiency and security. As inference moves from experimentation into regulated and revenue-driving services, teams need the ability to manage context with policy, isolation, auditability, lifecycle controls, and optional security – all while keeping the KV cache fast and usable as a shared system resource. VAST delivers these AI-native data services as part of the AI OS, helping customers avoid rebuild storms, reduce idle-GPU resource waste, and improve infrastructure efficiency as context sizes and session concurrency explode.

"Context is the fuel of thinking. Just like humans who write things down to remember them, AI agents need to save their work so they can reuse what they've learned," said Kevin Deierling, Senior Vice President of Networking, NVIDIA. "Multi-turn and multi-user inferencing fundamentally transforms how context memory is managed at scale. VAST Data AI OS with NVIDIA BlueField-4 enables the NVIDIA Inference Context Memory Storage Platform and a coherent data plane designed for sustained throughput and predictable performance as agentic workloads scale."


[To share your insights with us, please write to psen@itechseries.com]


