Speculative Decoding with vLLM using Gemma
Improving LLM inferences with speculative decoding using Gemma
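Speculative decoding speeds up inference by letting a small, cheap draft model propose several tokens ahead, which the large target model (e.g. Gemma) then verifies in a single pass: matching tokens are accepted in bulk, and decoding falls back to the target model's own prediction at the first mismatch. The toy sketch below illustrates that accept/verify loop under greedy decoding with two hypothetical integer-sequence "models" (`draft_next`, `target_next` are illustrative stand-ins, not vLLM APIs); real systems such as vLLM use probabilistic rejection sampling over the models' token distributions rather than exact matching.

```python
def draft_next(ctx):
    # Toy draft model: fast but imperfect; it guesses wrong on multiples of 4.
    n = ctx[-1] + 1
    return n + 1 if n % 4 == 0 else n

def target_next(ctx):
    # Toy target model: slow but always correct; predicts the next integer.
    return ctx[-1] + 1

def speculative_decode(ctx, num_tokens, k=4):
    """Generate num_tokens continuations of ctx, drafting k tokens per step."""
    out = list(ctx)
    while len(out) - len(ctx) < num_tokens:
        # 1) The draft model cheaply proposes a run of k candidate tokens.
        proposal = []
        for _ in range(k):
            proposal.append(draft_next(out + proposal))
        # 2) The target model verifies the run; accept the longest matching
        #    prefix, then emit the target's own token at the first mismatch.
        accepted = []
        for tok in proposal:
            expected = target_next(out + accepted)
            if tok == expected:
                accepted.append(tok)       # draft agreed with target: keep it
            else:
                accepted.append(expected)  # correct the token and stop the run
                break
        out.extend(accepted)
    return out[len(ctx):][:num_tokens]
```

Because every verification step yields at least one token (the target's correction), the loop always makes progress, and when the draft model agrees often, multiple tokens are committed per target-model pass; that amortization is the source of the speedup.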