Announcing availability of Llama 2-Chat on OctoAI

Blog Author - Jason Knight

Jul 20, 2023

3 minutes

This week, Meta released Llama 2, the highly anticipated commercially usable successor to the original LLaMA family of large language models (LLMs). We are happy to announce that Llama 2-Chat 7B is now available on OctoAI. You can get started with Llama 2-Chat from your OctoAI account today.

Growth of open source LLMs, LLaMA and Llama 2

Previously, we’ve discussed the growth of open LLMs and their emergence as a viable alternative to proprietary hosted models. Meta’s initial launch of the Large Language Model Meta AI (LLaMA) family was critical to this momentum. Meta demonstrated that training LLMs on far more data than was considered optimal at the time (i.e., according to the Chinchilla scaling laws) led to models that performed much better than any previous models at the sizes Meta released (and even better than some larger open models at the time). LLaMA also formed the basis for other open models like Vicuna and Alpaca, fine-tuned from LLaMA for chat and instruction following, respectively.

But a limitation of LLaMA was that its license did not cover commercial use. Llama 2 comes with updated licensing that allows free commercial use, unlocking its potential for a broad range of end-user applications. A detailed description of what is and is not allowed in commercial usage is available in the Llama Community License Agreement on the Llama 2 download page.

The Llama 2 family includes two sets of models, both available in three sizes, with 7B, 13B and 70B parameters.
  • Llama 2, the base model, pre-trained on 2 trillion tokens. That is 40% more than the original LLaMA, which was already trained far longer than most models.
  • Llama 2-Chat, a model fine-tuned by Meta for chat use cases. The Llama 2 paper describes the multi-month effort that went into fine-tuning, evaluating, and refining the model to deliver a best-in-class chat experience. Llama 2-Chat was fine-tuned with a mix of techniques, including instruction tuning using over 100,000 curated annotations and reinforcement learning with human feedback (RLHF) using over a million data points. The paper also includes the team's human evaluation results, which show Llama 2-Chat consistently outperforming the other open LLMs in the comparison.
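
To put the 2 trillion token figure in context, here is a rough sketch of the tokens-per-parameter ratio for each Llama 2 size. The value of about 20 tokens per parameter used for comparison is a commonly cited rule of thumb derived from the Chinchilla scaling laws; it is an assumption for illustration and does not come from the Llama 2 release. The function and constant names are hypothetical.

```python
# Assumption: ~20 training tokens per parameter is a widely quoted
# rule of thumb for "compute-optimal" training under Chinchilla scaling.
CHINCHILLA_TOKENS_PER_PARAM = 20

# From the post: every Llama 2 base model is pre-trained on 2 trillion tokens.
LLAMA2_TRAINING_TOKENS = 2e12

# Parameter counts for the three released sizes.
LLAMA2_SIZES = {"7B": 7e9, "13B": 13e9, "70B": 70e9}


def tokens_per_param(tokens: float, params: float) -> float:
    """Return how many training tokens were seen per model parameter."""
    return tokens / params


for name, params in LLAMA2_SIZES.items():
    ratio = tokens_per_param(LLAMA2_TRAINING_TOKENS, params)
    multiple = ratio / CHINCHILLA_TOKENS_PER_PARAM
    print(f"Llama 2 {name}: ~{ratio:.0f} tokens/param "
          f"(~{multiple:.1f}x the rule of thumb)")
```

Under this rule of thumb, the smaller the model, the further past the "optimal" point it was trained, which is consistent with the approach described above for the original LLaMA.
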
Meta has not shared details of the Llama 2 training dataset, such as the specific data sources and references used (unlike for the original LLaMA). The release publication does provide the following guidance about the data: it includes a mix of public data sources, it does not include data from Meta’s products or services (like Facebook), and Meta attempted to remove sites known to contain a high volume of personal information. The paper does share the compute hardware used for training and the total compute needed to build these models. Pretraining the Llama 2 70B model used 1,720,320 GPU hours on NVIDIA A100 GPUs, which would amount to millions of dollars if converted to cloud computing costs, and is a helpful reminder of the cost of building and training new foundation models.
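
As a back-of-the-envelope check on the "millions of dollars" estimate, the sketch below multiplies the reported GPU hours by a few hourly rates. The rates are hypothetical placeholders, not quotes from any cloud provider or from the Llama 2 paper; actual pricing varies widely with provider, commitment, and region.

```python
# From the post: GPU hours reported for pretraining Llama 2 70B on A100s.
GPU_HOURS_70B = 1_720_320


def cloud_cost_usd(gpu_hours: int, usd_per_gpu_hour: float) -> float:
    """Estimate cost as GPU hours times a flat hourly rate."""
    return gpu_hours * usd_per_gpu_hour


# Hypothetical per-GPU hourly rates in USD, chosen only for illustration.
for rate in (1.0, 2.0, 4.0):
    cost = cloud_cost_usd(GPU_HOURS_70B, rate)
    print(f"At ${rate:.2f}/GPU-hour: ~${cost / 1e6:.1f}M")
```

Even at the lowest assumed rate, the total stays above one million dollars for the 70B model alone, before counting the smaller models, fine-tuning, or experimentation.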

Open source LLMs on OctoAI

OctoAI is committed to bringing developers the latest and most useful foundation models. We hear about innovative applications and experiences built on open LLMs every day, and we are continuously adding to the library of models available on OctoAI. Recent LLM additions include LLaMA 65B, Vicuna, and Falcon. We also recently added the OctoAI Endpoint integration in LangChain, making it easy for developers to build LangChain applications that use the latest open source LLMs on OctoAI.

Llama 2 marks a new milestone in the world of open LLMs, and we are actively working to evaluate and add Llama 2 models to OctoAI. Llama 2-Chat 7B is available today on OctoAI. We will be adding the 13B and 70B parameter models soon, alongside the base models for non-interactive workloads.

Try Llama 2-Chat on OctoAI today

You can start testing and building on Llama 2-Chat today with a free trial on OctoAI. You’re also welcome to join us on our Discord server to engage with the team and our community.

We’d love for you to try out the new Llama 2-Chat on OctoAI and let us know what you think!
