Tag

Kubernetes

3 posts on Kubernetes

April 10, 2026 · 6 min read
Self-Hosting an LLM on Kubernetes
Managed inference APIs are convenient until they are not. Here is the full picture of running your own LLM on Kubernetes: GPU scheduling, model storage, vLLM vs Ollama, and the operational tradeoffs.
kubernetes llm ai gpu infrastructure
March 20, 2026 · 12 min read
Why I Run Qdrant in Production: A 3-Node Cluster vs the Alternatives
Pinecone, Weaviate, Milvus, pgvector, Qdrant — five viable choices for a vector database. Here is why I picked Qdrant for production, how the 3-node cluster is laid out, and what the other options actually trade away.
vector-search qdrant rag infrastructure kubernetes
March 12, 2026 · 5 min read
Docker Gets You to Production. Kubernetes Keeps You There.
Docker solves the packaging problem. Kubernetes solves the operational problem. Here is what K8s actually adds, how its core objects work, and why rolling updates change how you think about deployments.
kubernetes docker devops infrastructure
