Oliver Buchannon
Mohinish S

Serverless Ventures | Cloud, Data & Distributed Systems | Angel & Advisor | Infra & Data Startups

SnackOnAI Blog

RelBench: The Benchmark That Makes Manual Feature Engineering on Databases Look Embarrassing

May 2, 2026 • 12 min read

Every major enterprise ML pipeline touches a relational database. Almost none of them use end-to-end deep learning on it. RelBench exists to change that, and its user study result is the most important number in the paper: relational deep learning (RDL) reduces human effort by more than an order of magnitude while matching or beating manually engineered features.

Mohinish S

Anthropic Builds a C Compiler with AI Agents

May 1, 2026 • 13 min read

16 Claude agents. 2,000 sessions. $20,000 in API costs. 100,000 lines of Rust. A working C compiler that boots Linux on three architectures. And the engineer who ran it says the result leaves him "feeling uneasy."

Mohinish S

PaperBanana: The Agentic Figure Factory for AI Research Papers

Apr 30, 2026 • 12 min read

Academic figure generation is not a creative problem. It is an information extraction, style normalization, and iterative refinement problem. PaperBanana treats it exactly that way, and the benchmark results show why that framing wins.

Mohinish S

Cross-Layer Transcoders: The Interpretability Tool That Might Be Lying to You

Apr 29, 2026 • 13 min read

The interpretability tool designed to show you how a language model reasons can, under certain training conditions, produce circuits that match behavior while hiding the actual computation. This is not a minor caveat. It is a structural failure mode of the architecture.

Mohinish S

Aurum: The Local AI Stack You Can Build in an Afternoon

Apr 28, 2026 • 12 min read

Home inventory management is the most mundane problem AI has ever solved. It is also a perfect specification document for building a fully local, privacy-first, multi-modal AI application from scratch.

Mohinish S

Qwen3-TTS: Voice AI Without The Cloud

Apr 27, 2026 • 11 min read

Voice cloning from 3 seconds of audio, running locally, with 97 ms first-packet latency, open-source under Apache 2.0. The cloud TTS business model just got a serious challenge.

Mohinish S

Ollama: The Easiest Way to Run Powerful AI Models Locally

Apr 26, 2026 • 12 min read

Ollama is not an inference engine. It is a model manager with an inference engine hidden inside it. The community conflates these, and the confusion costs them performance.

Mohinish S

Flash Attention: The Engine Behind Efficient Transformers

Apr 25, 2026 • 12 min read

FlashAttention does more FLOPs than standard attention and is still faster. That is not a bug. It is the entire point.

Mohinish S

MLC-LLM: Run Any LLM on Any Device

Apr 24, 2026 • 10 min read

The hard part of deploying LLMs on phones and browsers is not the model. It is the compiler. MLC-LLM solved the compiler.

Mohinish S

llm.c: Minimal LLM Trainer

Apr 23, 2026 • 11 min read

You do not need PyTorch to train GPT-2. You never did. Andrej Karpathy just proved it in 1,000 lines of C.

Mohinish S

Transformers: The Library That Changed AI

Apr 22, 2026 • 11 min read

The most influential ML library ever written violates every software engineering principle on purpose. That is the point.

Mohinish S

Web World Models: The Web Stack Is the World Model

Apr 21, 2026 • 10 min read

Web stacks were never meant to simulate reality. Princeton just showed they already do.

Mohinish S

FastChat: Open-Source Chatbot Platform for Modern AI

Apr 20, 2026 • 11 min read

FastChat's Real Output Was Never Tokens. It Was 1.5 Million Human Votes.

Mohinish S

Exo: A Distributed Inference Engine Runs a Trillion-Parameter Model on a Thunderbolt Cable

Apr 19, 2026 • 12 min read

How Exo Replaced a $780,000 GPU Stack With Four Mac Studios and One Open Source Daemon

Mohinish S

llama.cpp: Bringing Large Language Models to Every Computer

Apr 18, 2026 • 10 min read

No Dependencies, No Excuses: How C and a Custom File Format Put 70B Models on Your Laptop

Mohinish S

whisper.cpp: The Speech Recognition Engine That Runs Where Python Won't Go

Apr 17, 2026 • 10 min read

The Architecture That Runs OpenAI's Whisper on a $35 Device

Mohinish S

Unsloth: Fine-Tuning Is Not Slow. Your Kernels Are.

Apr 14, 2026 • 17 min read

Your Kernels Are Slow. Here's What Unsloth Did About It.

Mohinish S

vLLM: The Inference Engine You Think You Understand But Probably Don't

Apr 12, 2026 • 15 min read

How UC Berkeley Stole 50 Years of OS Research and Made LLM Serving 24x Faster

Mohinish S

AIBrix: The New Frontier of Cloud-Native LLM Orchestration

Apr 11, 2026 • 17 min read

The missing orchestration layer that makes vLLM production-ready at scale

Mohinish S

Nebula-S, SVMS1-4B, and the On-Device Reasoning Architecture Nobody Is Talking About Correctly

Apr 10, 2026 • 16 min read

Everyone is arguing about which 70B model wins on MMLU. Meanwhile, a small team in Austin is quietly stacking a multi-stream reasoning architecture on top of Qwen3-4B and claiming it beats models twice its size on edge hardware. Let's crack it open.

Mohinish S

TensorRT-LLM: NVIDIA's Inference Compiler Is Not What You Think It Is

Apr 9, 2026 • 17 min read

The inference compiler NVIDIA doesn't want you to call a compiler

Mohinish S

The Ralph Loop: Why a Bash While Loop Is One of the Most Honest Architectures in Agentic AI

Apr 8, 2026 • 16 min read

A PRD-driven, test-gated, self-correcting autonomous coding loop built on Claude Code. Deceptively simple. Surprisingly powerful. And the industry's best-kept design pattern hiding in plain sight.

Mohinish S

LMCache & SGLang: The KV Cache Stack Your LLM Inference Deserves

Apr 7, 2026 • 16 min read

How two open-source systems redesign inference from the ground up, cutting TTFT by 10×, raising throughput by 15×, and treating KV cache as a first-class citizen of your serving stack.

Mohinish S

So You Want to Build the Next Medvi? Here Is What It Actually Costs

Apr 5, 2026 • 16 min read

The Complete Founder's Blueprint for Launching a Medvi-Style Telehealth Brand in 2026

Mohinish S

The $401M Company Built by Two People and a Dozen AI Tools

Apr 4, 2026 • 13 min read

How Medvi.org Rewrote the Rules of What a Business Can Be

Mohinish S

© 2026 Snack On AI.