#6 : Open Source LLMs : You Need to Know 💥

Important trends in LLMs, Llama 2, and a list of the best open-source LLMs!

Aug 07, 2023

Welcome to episode 6 of AI Tribe 1-1-1, a biweekly newsletter designed to spark your interest in AI tools, concepts, applications and research!

This episode is all about the open-source LLMs available today and their benefits compared to proprietary models like ChatGPT. It also covers some important trends related to LLMs. So let’s jump straight in!

⚙️ Tool - Llama 2 : Incredible Open-Source LLM

Llama 2 is a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7B to 70B parameters. The base model seems to perform better than GPT-3 and on par with GPT-3.5 (except on coding).

Llama 2 is free to download for both research and commercial use. This is huge! Every developer with a good idea can build a business around Llama 2.

You can also try it here : https://labs.perplexity.ai/

fig 1. The superior writing abilities of LLMs, as manifested in surpassing human annotators in certain tasks, are fundamentally driven by RLHF.

😇 Today’s Recipe : All the best Open Source LLMs you can use !

In recent months there has been a rapid development of new models, variants, and an increasing number of options that can rival ChatGPT in performance.

Proprietary models: GPT-4, Claude

Open-source models:

Commercial use permitted (licenses such as Apache or MIT, or a custom license as with Llama 2): Llama 2, Falcon, GPT4All, Qwen from Alibaba, Dolly, Raven, H2O, StableLM
Non-commercial (research only): Vicuna, Alpaca, Koala

The diagram below shows that GPT-4 outperforms GPT-3.5 and Claude in Coding and Reasoning, while Vicuna-13B lags significantly behind in several specific categories: Extraction, Coding, and Math. The Llama 2 70B model now roughly ties with GPT-3.5 (though not yet in coding) and performs noticeably better than Falcon and Vicuna.

fig 2. The comparison of 6 representative LLMs regarding their abilities in 8 categories: Writing, Roleplay, Reasoning, Math, Coding, Extraction, STEM, Humanities.

The emergence of these open-source models is just the beginning, though. I am optimistic about the future of open-source LLMs, including Llama 2, which should improve significantly thanks to its open weights.

Which model is good? How much data should be used to achieve reasonable performance?

Google and OpenAI have documented several studies indicating that large language models with tens of billions of parameters start to exhibit capabilities approaching human performance.

In the graph, we can see the initial models with only 1.3 billion parameters, which did not perform well in terms of accuracy, shown on the vertical axis. Regardless of the context or examples provided, the maximum performance achieved was around 5%.

When we move from 1 billion to 10 billion parameters, the situation changes significantly. With no examples or only one, the performance still falls short, but with few-shot prompting techniques these models improve. The maximum performance achieved in these scenarios is around 25%.

When we make a qualitative leap from tens of billions to hundreds of billions of parameters, the landscape changes considerably. Models can achieve performance exceeding 60%, comparable to human-level performance. These models can learn a lot from a single example (one-shot prompting). Therefore, among open-source large language models, those with over 170 billion parameters can be expected to perform very well.
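The difference between zero-shot, one-shot, and few-shot prompting comes down to how many worked examples are placed in the prompt before the actual question. The sketch below shows one way to assemble such prompts as plain text. The `build_prompt` helper, the `Q:`/`A:` layout, and the sentiment examples are all assumptions made for illustration; they are not taken from the studies mentioned above, and real models may expect a different prompt format.

```python
from typing import List, Tuple


def build_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble a prompt with zero, one, or several worked examples.

    Hypothetical helper: the Q:/A: layout is just one common convention.
    """
    lines = [task, ""]
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
        lines.append("")
    # The final question is left unanswered so the model completes it.
    lines.append(f"Q: {query}")
    lines.append("A:")
    return "\n".join(lines)


# Illustrative task and examples (invented for this sketch).
task = "Classify the sentiment of each sentence."
examples = [
    ("Is 'I loved this film' positive or negative?", "positive"),
    ("Is 'The plot was dull' positive or negative?", "negative"),
]
query = "Is 'What a wonderful surprise' positive or negative?"

zero_shot = build_prompt(task, [], query)            # no examples
one_shot = build_prompt(task, examples[:1], query)   # a single example
few_shot = build_prompt(task, examples, query)       # several examples

print(few_shot)
```

The same question is asked in all three prompts; only the number of examples changes, which is the variable the graph compares across model sizes.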

Hence, one criterion to consider when selecting an open-source large language model is the number of parameters it has. It is also worth noting that for Llama 2, Meta kept the number of parameters roughly constant but increased the size of the pretraining corpus by 40%, doubled the context length of the model (to 4k tokens) and introduced grouped-query attention (Ainslie et al., 2023).
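Parameter count also determines how much hardware a model needs, which matters when choosing between sizes. As a rough rule of thumb, the memory needed just to hold the weights is the number of parameters multiplied by the bytes used per parameter. The sketch below applies that arithmetic to the three Llama 2 sizes; the function name is hypothetical, the parameter counts are nominal, and the estimate deliberately ignores activations, the attention cache, and framework overhead, so real usage will be higher.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str = "float16") -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Nominal parameter counts for the Llama 2 family.
models = [("Llama 2 7B", 7e9), ("Llama 2 13B", 13e9), ("Llama 2 70B", 70e9)]

for name, params in models:
    fp16 = weight_memory_gb(params, "float16")
    int4 = weight_memory_gb(params, "int4")
    print(f"{name}: ~{fp16:.0f} GB in float16, ~{int4:.1f} GB in int4")
```

This is only a first filter: it tells you whether a model could fit on a given GPU at all, not how fast or how well it will run.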

Additionally, there are public leaderboards that provide insight into the current best-performing open-source large language models:

https://opencompass.org.cn/leaderboard-llm

https://lmsys.org/blog/2023-06-22-leaderboard/

In conclusion, we cannot definitively state that one model is better than others across all tasks. However, we can gain a general idea of which models perform better overall, and then it becomes a matter of testing each model in specific tasks and selecting the one that performs best for a particular use case.
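Testing each candidate on your own task can be as simple as running the same set of prompts through every model and comparing scores. The following is a minimal sketch of that idea, assuming each model can be wrapped as a function from prompt to answer. `model_a` and `model_b` are hypothetical stand-ins with hard-coded answers, not real LLMs, and exact-match accuracy is only one possible metric.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]


def accuracy(model: Model, cases: List[Tuple[str, str]]) -> float:
    """Fraction of cases where the answer matches, ignoring case and spaces."""
    if not cases:
        raise ValueError("at least one test case is required")
    correct = sum(
        1
        for prompt, expected in cases
        if model(prompt).strip().lower() == expected.strip().lower()
    )
    return correct / len(cases)


def pick_best(
    models: Dict[str, Model], cases: List[Tuple[str, str]]
) -> Tuple[str, Dict[str, float]]:
    """Score every model on the same cases and return the top scorer."""
    scores = {name: accuracy(model, cases) for name, model in models.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical stand-ins for real models, for illustration only.
def model_a(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "unknown"


def model_b(prompt: str) -> str:
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, "unknown")


cases = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "paris"),
    ("Largest planet?", "Jupiter"),
]

best, scores = pick_best({"model_a": model_a, "model_b": model_b}, cases)
print(best, scores)
```

In practice the stand-in functions would be replaced by calls to the actual models, and the test cases by examples drawn from the use case you care about.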

🔗Article : Benefits of Open-source LLM Models

Although ChatGPT has made a significant impact in the world of technology, it is important to note that ChatGPT is not an open source model. This means that we cannot modify its source code as it is not publicly available, nor can we use it for free since there is a cost associated with using the model through its API.

In contrast to proprietary models like ChatGPT, there are open source models available. These models offer several advantages over proprietary models, such as data privacy, customization, affordability, and democratization.

Privacy

When it comes to data privacy, many companies prefer to keep control over their data and are obligated to keep it secure without sharing it with third parties. Using ChatGPT can pose serious privacy issues because user data is sent to an external server, where it may be used to retrain the model and for analytics intended to improve the service.

Customization

Open-source models allow developers to train the models with their own data and even add filters to keep control over that data. This makes it possible to personalize the chatbot with company information and optimize it for the use cases that matter to the company.

Affordability

Many companies might initially think that developing an open source large language model is more costly, since it requires investing in computational resources and specialized personnel. However, this is not entirely true, especially when there is intensive use of such models.

For a low volume of traffic, in the range of thousands of requests per day, models like ChatGPT are more cost effective than open-source large language models. However, at millions of requests per day, a volume that is not unusually high and that a company can easily reach, open-source models become much more cost effective.
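The reasoning here is a break-even calculation: a hosted API charges per request, while self-hosting has a mostly fixed daily cost (hardware, staff) plus a small cost per request. The sketch below makes that arithmetic explicit. Every number in it is a hypothetical placeholder chosen for illustration, not a real price from any provider, and the function names are invented for this example.

```python
def api_cost_per_day(requests_per_day: float, price_per_request: float) -> float:
    """Daily cost of a pay-per-request hosted API."""
    return requests_per_day * price_per_request


def self_hosted_cost_per_day(
    requests_per_day: float,
    fixed_cost_per_day: float,
    marginal_cost_per_request: float,
) -> float:
    """Daily cost of self-hosting: fixed infrastructure plus per-request cost."""
    return fixed_cost_per_day + requests_per_day * marginal_cost_per_request


def break_even_requests(
    price_per_request: float,
    fixed_cost_per_day: float,
    marginal_cost_per_request: float,
) -> float:
    """Daily request volume above which self-hosting becomes cheaper."""
    saving = price_per_request - marginal_cost_per_request
    if saving <= 0:
        raise ValueError("self-hosting never breaks even at these prices")
    return fixed_cost_per_day / saving


# Hypothetical figures, for illustration only (not real prices).
PRICE_PER_REQUEST = 0.002      # hosted API, dollars per request
FIXED_COST_PER_DAY = 500.0     # servers and staff, dollars per day
MARGINAL_COST = 0.0002         # self-hosted compute, dollars per request

threshold = break_even_requests(PRICE_PER_REQUEST, FIXED_COST_PER_DAY, MARGINAL_COST)
print(f"Break-even at about {threshold:,.0f} requests per day")

for volume in (1_000, 1_000_000):
    api = api_cost_per_day(volume, PRICE_PER_REQUEST)
    own = self_hosted_cost_per_day(volume, FIXED_COST_PER_DAY, MARGINAL_COST)
    print(f"{volume:>9,} requests/day: API ${api:,.2f} vs self-hosted ${own:,.2f}")
```

With these placeholder values the API is cheaper at thousands of requests per day and self-hosting is cheaper at millions, matching the pattern described above; plugging in your own costs will move the break-even point.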

Democratization

These models, being openly accessible, provide opportunities for future research and development to address various artificial intelligence challenges.

Thank you for reading AI Tribe 1-1-1. This post is public, so feel free to share it.

Aug 7, 2023

Hey there. Great article on Open Source LLMs. As an AI enthusiast, I'm excited about the potential of models like Llama 2 and the benefits they offer compared to proprietary ones like ChatGPT. Looking forward to more AI Tribe 1-1-1 insights.


Aug 8, 2023

Thank you Adam! New LLMs are coming too. For instance, Qwen from Alibaba seems to have beaten Llama 2 in a few categories.
