How to Build Memory-Efficient Transformers with xFormers Using Packed Sequences, GQA, ALiBi, SwiGLU, and Causal Attention

By Editorial Team | June 17, 2026 | Updated: June 17, 2026


# Imports presumably made earlier in the tutorial; repeated here so the names resolve.
# `device` and `vanilla_attention` are assumed to be defined in the earlier sections.
import torch
import xformers.ops as xops
from xformers.ops.fmha import attn_bias as ab

print("\n" + "="*70 + "\n4. Variable-length packed batch - no padding waste\n" + "="*70)
seqlens = [37, 120, 8, 200]
total = sum(seqlens)
H, K = 8, 64
# All sequences are concatenated along the token axis of a single batch-1 tensor.
q = torch.randn(1, total, H, K, device=device, dtype=torch.float16)
k = torch.randn(1, total, H, K, device=device, dtype=torch.float16)
v = torch.randn(1, total, H, K, device=device, dtype=torch.float16)
try:
    # Block-diagonal bias: each token attends only within its own sequence.
    bias = ab.BlockDiagonalMask.from_seqlens(seqlens)
    out_packed = xops.memory_efficient_attention(q, k, v, attn_bias=bias)
    # Check the first segment against the reference attention run on that segment alone.
    s0 = seqlens[0]
    ref0 = vanilla_attention(q[:, :s0], k[:, :s0], v[:, :s0]).half()
    print("packed shape         :", tuple(out_packed.shape), "(all", total, "tokens, no pad)")
    print("segment-0 max diff   : {:.2e}".format((out_packed[:, :s0] - ref0).abs().max().item()))
    # Same packing, but with a causal mask inside each sequence.
    cbias = ab.BlockDiagonalCausalMask.from_seqlens(seqlens)
    _ = xops.memory_efficient_attention(q, k, v, attn_bias=cbias)
    print("-> also did a packed CAUSAL pass. This is how vLLM-style engines")
    print("   batch requests of different lengths with zero padding overhead.")
    # Split the packed output back into one tensor per original sequence.
    splits = bias.split(out_packed)
    print("recovered segments   :", [tuple(t.shape) for t in splits])
except Exception as e:
    print("BlockDiagonalMask path skipped on this version/backend:", repr(e))
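
To make the block-diagonal idea concrete, here is a minimal pure-Python sketch of the attention pattern that a mask built from `seqlens` describes. It is an illustration only, not how xFormers implements it: the helper names `segment_bounds` and `block_diagonal_mask` are hypothetical, and the real kernels never materialize a dense boolean matrix like this.

```python
from itertools import accumulate

def segment_bounds(seqlens):
    """Return the (start, end) offsets of each sequence in the packed buffer."""
    ends = list(accumulate(seqlens))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))

def block_diagonal_mask(seqlens, causal=False):
    """mask[i][j] is True when packed query token i may attend to key token j."""
    total = sum(seqlens)
    mask = [[False] * total for _ in range(total)]
    for start, end in segment_bounds(seqlens):
        for i in range(start, end):
            # Causal: only keys up to and including position i, within the segment.
            last = i + 1 if causal else end
            for j in range(start, last):
                mask[i][j] = True
    return mask

seqlens = [37, 120, 8, 200]
bounds = segment_bounds(seqlens)
mask = block_diagonal_mask(seqlens)
causal_mask = block_diagonal_mask(seqlens, causal=True)

packed_tokens = sum(seqlens)
padded_tokens = len(seqlens) * max(seqlens)  # cost of padding every row to the longest
print("segment bounds:", bounds)
print("packed tokens :", packed_tokens, "vs padded:", padded_tokens)
```

For the lengths used above, packing stores 365 tokens where a padded batch would store 800, and no token can see across a sequence boundary.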
print("\n" + "="*70 + "\n5. Grouped-query attention (5-D BMGHK format)\n" + "="*70)
B, M, K = 2, 256, 64
n_q_heads, n_kv_heads = 8, 2
# G = number of KV groups, Hq = query heads sharing each KV head.
G, Hq = n_kv_heads, n_q_heads // n_kv_heads
try:
    qg = torch.randn(B, M, G, Hq, K, device=device, dtype=torch.float16)
    # One real K/V head per group. expand() is a stride-0 view (no copy); it is used
    # here on the assumption that xFormers expects K/V to match Q's 5-D shape.
    kg = torch.randn(B, M, G, 1, K, device=device, dtype=torch.float16).expand(-1, -1, -1, Hq, -1)
    vg = torch.randn(B, M, G, 1, K, device=device, dtype=torch.float16).expand(-1, -1, -1, Hq, -1)
    out_gqa = xops.memory_efficient_attention(qg, kg, vg)
    print("GQA output shape     :", tuple(out_gqa.shape), "= [B, M, G, Hq, K]")
    print(f"-> {n_q_heads} query heads, only {n_kv_heads} KV heads: smaller KV-cache,")
    print("   which is exactly what Llama-/Mistral-class models use at inference.")
except Exception as e:
    print("GQA 5-D path skipped on this version/backend:", repr(e))
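
The "smaller KV-cache" claim can be checked with simple arithmetic. The sketch below is pure Python and independent of xFormers. The helper names `kv_head_for_query_head` and `kv_cache_bytes` are hypothetical, and the byte count assumes float16 storage (2 bytes per element) for a single layer.

```python
def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Index of the KV head that a given query head shares under GQA."""
    heads_per_group = n_q_heads // n_kv_heads
    return q_head // heads_per_group

def kv_cache_bytes(batch, seqlen, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes needed to cache keys and values for one layer (the factor 2 is K plus V)."""
    return 2 * batch * seqlen * n_kv_heads * head_dim * bytes_per_elem

B, M, K = 2, 256, 64
n_q_heads, n_kv_heads = 8, 2

mapping = [kv_head_for_query_head(h, n_q_heads, n_kv_heads) for h in range(n_q_heads)]
mha_bytes = kv_cache_bytes(B, M, n_q_heads, K)   # one KV head per query head
gqa_bytes = kv_cache_bytes(B, M, n_kv_heads, K)  # KV heads shared across groups

print("query head -> KV head:", mapping)
print("MHA cache bytes:", mha_bytes)
print("GQA cache bytes:", gqa_bytes)
print("reduction factor:", mha_bytes // gqa_bytes)
```

With 8 query heads and 2 KV heads, each KV head serves 4 query heads, so the per-layer cache shrinks by a factor of 4 at these sizes.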



Copyright © The Ai Today™. All rights reserved.
