Mushfiq Rahman

AI Infrastructure, LLM Systems

How I Built llm0: An LLM Gateway Architecture Walk-Through

Written by Mushfiq Rahman on May 6, 2026

From a weekend Python prototype to a 3 ms p50 Go gateway with Redis Lua scripts, pgvector semantic caching, and cross-provider failover. Why I built it: A team I follow…

Digital Pathology, Software Engineering

go-openslide: Modern Go Bindings for OpenSlide 4.0

Written by Mushfiq Rahman on April 8, 2026

I recently shipped v0.1.0 of go-openslide: a CGO binding library for OpenSlide 4.0, the standard C library for reading whole slide images (WSI) in digital pathology. Here’s what it…

Software Engineering

Why I’d Choose a Modular Monolith Over Microservices

Written by Mushfiq Rahman on March 22, 2026

This Is an Opinion: This is not a tutorial, just a perspective shaped by building and operating production systems. I’ve worked on systems that needed to process data reliably,…

AI Infrastructure, LLM Systems

Benchmarking Mistral-7B on RTX 4090: vLLM Concurrency, Latency, and Capacity Planning

Written by Mushfiq Rahman on March 5, 2026

Most LLM benchmarks highlight peak tokens per second. That metric is incomplete. For production systems, especially agentic workloads, what actually matters is: • Where is the latency–throughput knee? •…