Deep Learning
A Coding Implementation on Microsoft’s Phi-4-Mini for Quantized Inference, Reasoning, Tool Use, RAG, and LoRA Fine-Tuning

By Editorial Team | April 21, 2026 | 2 Mins Read


import subprocess, sys, os, shutil, glob


def pip_install(args):
   # Quiet pip install via the current interpreter; raise on failure.
   subprocess.run([sys.executable, "-m", "pip", "install", "-q", *args],
                  check=True)


pip_install(["huggingface_hub>=0.26,<1.0"])


pip_install([
   "-U",
   "transformers>=4.49,<4.57",
   "accelerate>=0.33.0",
   "bitsandbytes>=0.43.0",
   "peft>=0.11.0",
   "datasets>=2.20.0,<3.0",
   "sentence-transformers>=3.0.0,<4.0",
   "faiss-cpu",
])


for p in glob.glob(os.path.expanduser(
       "~/.cache/huggingface/modules/transformers_modules/microsoft/Phi-4*")):
   shutil.rmtree(p, ignore_errors=True)


for _m in list(sys.modules):
   if _m.startswith(("transformers", "huggingface_hub", "tokenizers",
                     "accelerate", "peft", "datasets",
                     "sentence_transformers")):
       del sys.modules[_m]
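The purge loop above relies on simple name-prefix matching against `sys.modules`; a minimal sketch of the same logic against a stand-in module table (the entries here are illustrative, not a real interpreter state):

```python
# Stand-in for sys.modules: only the key names matter for the purge logic.
fake_modules = {
    "transformers": object(),
    "transformers.models.phi3": object(),
    "peft.tuners.lora": object(),
    "numpy": object(),
}

prefixes = ("transformers", "huggingface_hub", "tokenizers",
            "accelerate", "peft", "datasets", "sentence_transformers")

# Iterate over a snapshot (list(...)) so deleting keys mid-loop is safe.
for name in list(fake_modules):
    if name.startswith(prefixes):
        del fake_modules[name]

print(sorted(fake_modules))
```

Note that `str.startswith` accepts a tuple of prefixes, which is what lets the original loop cover all seven package families in one test.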


import json, re, textwrap, warnings, torch
warnings.filterwarnings("ignore")


from transformers import (
   AutoModelForCausalLM,
   AutoTokenizer,
   BitsAndBytesConfig,
   TextStreamer,
   TrainingArguments,
   Trainer,
   DataCollatorForLanguageModeling,
)
import transformers
print(f"Using transformers {transformers.__version__}")


PHI_MODEL_ID = "microsoft/Phi-4-mini-instruct"


assert torch.cuda.is_available(), (
   "No GPU detected. In Colab: Runtime > Change runtime type > T4 GPU."
)
print(f"GPU detected: {torch.cuda.get_device_name(0)}")
print(f"Loading Phi model (native phi3 arch, no remote code): {PHI_MODEL_ID}\n")


bnb_cfg = BitsAndBytesConfig(
   load_in_4bit=True,
   bnb_4bit_quant_type="nf4",
   bnb_4bit_compute_dtype=torch.bfloat16,
   bnb_4bit_use_double_quant=True,
)


phi_tokenizer = AutoTokenizer.from_pretrained(PHI_MODEL_ID)
if phi_tokenizer.pad_token_id is None:
   phi_tokenizer.pad_token = phi_tokenizer.eos_token


phi_model = AutoModelForCausalLM.from_pretrained(
   PHI_MODEL_ID,
   quantization_config=bnb_cfg,
   device_map="auto",
   torch_dtype=torch.bfloat16,
)
phi_model.config.use_cache = True


print(f"\n✓ Phi-4-mini loaded in 4-bit. "
     f"GPU memory: {torch.cuda.memory_allocated()/1e9:.2f} GB")
print(f"  Architecture: {phi_model.config.model_type}   "
     f"(using built-in {type(phi_model).__name__})")
print(f"  Parameters: ~{sum(p.numel() for p in phi_model.parameters())/1e9:.2f}B")
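As a rough sanity check on the printed memory figure, NF4 stores weights at about 0.5 bytes per parameter, plus a small overhead for quantization constants. A back-of-the-envelope sketch (the parameter count and overhead factor are assumptions, not measured values):

```python
n_params = 3.8e9          # approximate Phi-4-mini parameter count (assumed)
bytes_per_param = 0.5     # 4-bit NF4 weight storage
overhead = 1.05           # rough allowance for quantization constants and buffers

est_gb = n_params * bytes_per_param * overhead / 1e9
print(f"Estimated 4-bit weight footprint: ~{est_gb:.1f} GB")
```

If the `torch.cuda.memory_allocated()` number printed above lands far from this estimate, something other than the quantized weights (KV cache, fragmentation, a second model copy) is likely occupying the GPU.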


def ask_phi(messages, *, tools=None, max_new_tokens=512,
           temperature=0.3, stream=False):
   """Single entry point for all Phi-4-mini inference calls below."""
   prompt_ids = phi_tokenizer.apply_chat_template(
       messages,
       tools=tools,
       add_generation_prompt=True,
       return_tensors="pt",
   ).to(phi_model.device)


   streamer = (TextStreamer(phi_tokenizer, skip_prompt=True,
                            skip_special_tokens=True)
               if stream else None)


   with torch.inference_mode():
       out = phi_model.generate(
           prompt_ids,
           max_new_tokens=max_new_tokens,
           do_sample=temperature > 0,
           temperature=max(temperature, 1e-5),
           top_p=0.9,
           pad_token_id=phi_tokenizer.pad_token_id,
           eos_token_id=phi_tokenizer.eos_token_id,
           streamer=streamer,
       )
   return phi_tokenizer.decode(
       out[0][prompt_ids.shape[1]:], skip_special_tokens=True
   ).strip()
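`ask_phi` expects chat-format messages and, optionally, a list of tool schemas for the chat template. A minimal payload sketch (the `get_weather` tool is purely illustrative; actually generating a reply requires the GPU-loaded model above):

```python
# Chat messages in the role/content format apply_chat_template expects.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

# Optional tool schema in the common JSON-schema function format
# (hypothetical tool, for illustration only).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# With the model loaded, passing these to ask_phi would produce a reply string.
print(messages[1]["content"])
```

Keeping all generation behind one helper like this makes later sections (RAG, tool use, fine-tuned checkpoints) trivial to wire up: each one only has to build a `messages` list.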


def banner(title):
   # Print a section divider with the given title between rules of '=' signs.
   print("\n" + "=" * 78 + f"\n  {title}\n" + "=" * 78)


