Gemma: A New Open AI Model Announced By Google
Google has unveiled an open AI model based on Gemini technology that is laptop-friendly and can be used to build chatbots and content creation tools.
Drawing on the technology behind Gemini, Google has developed a lightweight yet powerful open large language model designed for resource-constrained environments such as a laptop or cloud infrastructure.
Gemma may be used to build almost anything a language model can do, including chatbots and content creation tools, technology that SEOs have long been waiting for.
Two versions are available: one with two billion parameters (2B) and one with seven billion parameters (7B). The number of parameters indicates the model's complexity and potential capability. Models with more parameters can produce more nuanced answers and achieve a deeper understanding of language, but they cost more to train and run.
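As a rough illustration of what the parameter counts mean in practice, the weights alone dictate a minimum memory footprint. The sketch below assumes 16-bit (2-byte) weights, a common inference precision; actual requirements vary with quantization and runtime overhead.

```python
# Back-of-the-envelope memory estimate for Gemma's two sizes.
# Assumes 2 bytes per parameter (16-bit weights); this is an
# illustrative assumption, not an official figure from Google.
def approx_weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just to hold the model weights, in GB."""
    return num_params * bytes_per_param / 1e9

print(approx_weight_memory_gb(2_000_000_000))  # Gemma 2B -> 4.0 GB
print(approx_weight_memory_gb(7_000_000_000))  # Gemma 7B -> 14.0 GB
```

This is why the 2B model is the laptop-friendly option, while the 7B model benefits from a GPU or cloud hardware.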
By making Gemma publicly available, Google aims to broaden access to cutting-edge artificial intelligence that is trained to be safe and responsible from the start, and that can be further optimized for safety using an accompanying toolkit.
Gemma By DeepMind
The model's lightweight, efficient design makes it well suited for distribution to a larger number of end users.
Google's official announcement highlighted the following key points:
- We’re providing model weights in two sizes, Gemma 2B and Gemma 7B, each available in pre-trained and fine-tuned versions.
- A new Responsible Generative AI Toolkit provides guidance and essential resources for building safer AI applications with Gemma.
- Toolchains for inference and supervised fine-tuning (SFT) are provided across all major frameworks (JAX, PyTorch, and TensorFlow) through native Keras 3.0.
- Gemma is easy to get started with thanks to its integration with popular tools such as Hugging Face, MaxText, NVIDIA NeMo, and TensorRT-LLM, along with ready-to-use Colab and Kaggle notebooks.
- Pre-trained and fine-tuned Gemma models run on your laptop, desktop, or Google Cloud, and can be easily deployed on Vertex AI and Google Kubernetes Engine (GKE).
- Industry-leading performance is ensured through optimization across different AI hardware platforms, such as NVIDIA GPUs and Google Cloud TPUs.
- The terms of use permit responsible commercial usage and distribution for all organizations, regardless of size.
Examining Gemma
Awni Hannun, a machine learning research scientist at Apple, analyzed Gemma and found it highly efficient and well suited to low-resource scenarios.
According to Hannun, Gemma has a vocabulary of 250,000 (250k) tokens, compared to 32k for similar models. That matters because a larger vocabulary lets Gemma recognize and understand a wider range of words, helping it handle tasks involving complex language. His analysis suggests this large vocabulary improves the model’s adaptability to various content types, and he believes it may also benefit math, coding, and other modalities.
He also noted the model's unusually large embedding weights (~750 million parameters). Embedding weights are the parameters that map words to representations of their meanings and relationships.
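The ~750 million figure follows from simple arithmetic: an embedding table has one row per vocabulary token, so its size is the vocabulary size times the embedding width. The hidden dimension below is an assumed value for illustration, not a number from the announcement.

```python
# Why a 250k-token vocabulary produces a huge embedding table:
# parameters = vocab_size x hidden_dim.
vocab_size = 250_000   # ~250k tokens, per Hannun's analysis
hidden_dim = 3072      # ASSUMED embedding width, for illustration only

embedding_params = vocab_size * hidden_dim
print(f"{embedding_params / 1e6:.0f}M parameters")  # -> 768M, in the ~750M ballpark
```

A model with a typical 32k vocabulary and the same width would need only about a tenth as many embedding parameters, which is why this table stands out.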
He emphasized that the embedding weights, which encode fine-grained information about word meanings and relationships, are used not only for processing the input but are also shared with the output head when producing the model’s output. This sharing increases the model’s efficiency by letting it apply its linguistic understanding more effectively while generating text.
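The weight sharing described above can be sketched in a few lines. This is a minimal toy illustration of the tied-embedding idea, not Gemma's actual implementation; the sizes and the identity "transformer" are stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, hidden_dim = 1000, 64          # toy sizes for illustration
embedding = rng.normal(size=(vocab_size, hidden_dim))

# Input side: a token id is mapped to its embedding vector.
token_id = 42
x = embedding[token_id]                    # shape: (hidden_dim,)

# ... transformer layers would process x here ...
hidden_state = x                           # stand-in for the final hidden state

# Output side: the SAME matrix scores every vocabulary token,
# instead of a separate (vocab_size x hidden_dim) output head.
logits = embedding @ hidden_state          # shape: (vocab_size,)
```

Tying the output head to the embedding table saves a full vocab-sized weight matrix, which is significant when the table itself is ~750M parameters.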
For end users, this means more accurate, relevant, and contextually appropriate responses, broadening the model’s usefulness in chatbots, translation, and content creation.
He posted on Twitter:
“When compared to other open-source models, the vocabulary is enormous: 250K versus 32k for Mistral 7B.
Perhaps very helpful for math, coding, and other modalities with a long tail of symbols.
Furthermore, the embedding weights are large (~750M params), so the output head shares them.”
In a subsequent tweet, he also mentioned a training optimization that could lead to more precise and nuanced model responses, since it helps the model learn and adapt more effectively during training.
He posted on Twitter:
“There is a unit offset in the RMS norm weight.
They do “x * (1 + weight)” in place of “x * weight.”
This seems to be an optimization for training. The weight is normally initialized to 1, though here it most likely starts close to 0, in line with all the other parameters.”
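The trick Hannun describes can be sketched as follows. This is a minimal illustration built on standard RMSNorm, not Gemma's actual code; the point is that with the unit offset, a weight initialized at 0 behaves like a conventional weight initialized at 1.

```python
import numpy as np

def rms_norm_unit_offset(x: np.ndarray, weight: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """RMSNorm with the unit offset: scale by (1 + weight) instead of weight.

    With weight initialized to zeros (like most other parameters),
    (1 + weight) starts at 1, matching the usual init of plain RMSNorm.
    """
    rms = np.sqrt(np.mean(x**2) + eps)
    return (x / rms) * (1.0 + weight)

x = np.array([3.0, -4.0])          # toy activation vector
weight = np.zeros_like(x)          # initialized near zero, per the tweet

out = rms_norm_unit_offset(x, weight)
# At initialization this is identical to standard RMSNorm with weight = 1.
print(out)
```

Keeping all parameters initialized near zero in the same way can simplify the optimizer's job, which is presumably why Hannun reads this as a training optimization.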
He went on to say that while there are more optimizations in the training and data, those two aspects are what really stood out.
Made To Be Responsible And Safe
A crucial characteristic is that Gemma was designed from the start to be safe, making it ready for use and deployment. Sensitive and personal data were filtered out of the training data, and Google used reinforcement learning from human feedback (RLHF) to train the model toward responsible behavior.
It was further debugged through manual red-teaming, automated adversarial testing, and capability assessments for undesirable and hazardous behaviors.
Google has also made available a set of tools to assist end users in enhancing their safety:
- Alongside Gemma, Google is launching a new Responsible Generative AI Toolkit to help researchers and developers prioritize building safe and responsible AI applications. The toolkit includes:
- Safety classification: a novel methodology for building robust safety classifiers from a minimal number of examples.
- Debugging: a model debugging tool for investigating Gemma’s behavior and addressing potential problems.
- Guidance: best practices for model builders, drawing on Google’s extensive experience in developing and deploying large language models.
Conclusion
Gemma is a collection of “open models” from Google that enables individuals and businesses to develop their own AI software. Following similar moves by Meta Platforms and other companies, Google has released these models so that external developers can use and customize them to fit their requirements.