https://github.com/google-deepmind/recurrentgemma
RecurrentGemma is a family of open-weights language models by Google DeepMind, based on the novel Griffin architecture. This architecture achieves fast inference when generating long sequences by replacing global attention with a mixture of local attention and linear recurrences.
This repository contains the model implementation and examples for sampling and fine-tuning. We recommend most users adopt the Flax implementation, which is highly optimized. We also provide an un-optimized PyTorch implementation for reference.
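For orientation, below is a minimal sampling sketch using the Flax implementation. The checkpoint and tokenizer paths are placeholders, and the API names used here (`load_parameters`, `GriffinConfig.from_flax_params_or_variables`, `Griffin`, `Sampler`) are assumptions drawn from the repository's example colabs and may differ between releases; treat the colabs in this repository as the authoritative reference.

```python
# Minimal sampling sketch for the Flax implementation.
# NOTE: the API names below are assumptions based on the repo's example
# colabs and may differ between releases -- consult the colabs for the
# authoritative usage.
import sentencepiece as spm

from recurrentgemma import jax as recurrentgemma

# Hypothetical placeholder paths to a downloaded checkpoint and tokenizer.
CKPT_PATH = "/path/to/recurrentgemma-2b"
VOCAB_PATH = "/path/to/tokenizer.model"

# Load the pretrained parameters and derive the model config from them.
params = recurrentgemma.load_parameters(CKPT_PATH, "single_device")
config = recurrentgemma.GriffinConfig.from_flax_params_or_variables(params)
model = recurrentgemma.Griffin(config)

# SentencePiece tokenizer shipped alongside the checkpoint.
vocab = spm.SentencePieceProcessor(model_file=VOCAB_PATH)

# The sampler wraps the model for autoregressive generation.
sampler = recurrentgemma.Sampler(model=model, vocab=vocab, params=params)

output = sampler(
    ["What is the Griffin architecture?"],
    total_generation_steps=128,
)
print(output.text[0])
```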