Deploy LLMs on Kubernetes with vLLM: GPU node setup, NVIDIA device plugin, KEDA autoscaling, multi-model serving, and …
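The topics above (GPU scheduling via the NVIDIA device plugin, a vLLM serving container, and KEDA-driven autoscaling) can be sketched with a pair of manifests. This is a minimal illustration, not a production configuration: the model name, namespace, Prometheus address, and threshold query are placeholders, and it assumes the NVIDIA device plugin and KEDA are already installed in the cluster.

```yaml
# Deployment running the vLLM OpenAI-compatible server on one GPU.
# Assumes GPU nodes expose the nvidia.com/gpu resource via the
# NVIDIA device plugin; model and image tag are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-server
  template:
    metadata:
      labels:
        app: vllm-server
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args: ["--model", "meta-llama/Llama-3.1-8B-Instruct"]  # placeholder model
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1   # request one GPU from the device plugin
---
# KEDA ScaledObject scaling the Deployment on a Prometheus metric.
# The serverAddress, query, and threshold are hypothetical examples.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vllm-scaler
spec:
  scaleTargetRef:
    name: vllm-server
  minReplicaCount: 1
  maxReplicaCount: 4
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        query: sum(vllm:num_requests_waiting)   # example queue-depth metric
        threshold: "10"
```

Applied with `kubectl apply -f`, the Deployment schedules onto a GPU node and KEDA adjusts replicas between 1 and 4 as the queue-depth query crosses the threshold; the exact metric to scale on depends on what your vLLM build exports.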