NLP Monday
5th June 2023
Happy Monday, everyone! I'm thrilled to present the very first edition of our newsletter. Get ready for an abundance of exciting news!
Unleash the Power of Large Language Models on Compact Hardware
From ChatGPT: By ChatGPT, For ChatGPT - Harnessing ChatGPT for Generating Chatbot Training Data
A Reality Check for Your Alpaca Models - Evaluating the Accuracy of Open Source Language Models in Emulating ChatGPT
Unleash the Power of Large Language Models on Compact Hardware
You can now train large language models yourself on commercial GPUs. QLoRA: Efficient Finetuning of Quantized LLMs presents a breakthrough: NF4, a 4-bit NormalFloat data type, combined with several memory optimizations. With these innovations, a 65B-parameter model can be finetuned on a single 48GB GPU within a day, without any compromise in performance (reaching an impressive 99% of ChatGPT's level on the Vicuna benchmark, a popular chatbot evaluation). On a modern GPU, you can now train and test these models yourself. The authors have also integrated the method with Huggingface and generously released multiple open-source models. Their code is on GitHub: https://github.com/artidoro/qlora
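To get a feel for what 4-bit quantization means, here is a toy sketch of block-wise quantization in the spirit of NF4: each weight is stored as one of 16 levels derived from a normal distribution's quantiles, plus one absmax scale per block. The level table below is an illustrative approximation, not the paper's official NF4 table, and the block of weights is invented for the example.

```python
# Toy sketch of NF4-style 4-bit block-wise quantization.
# The 16 levels are built from evenly spaced quantiles of a standard
# normal distribution and rescaled into [-1, 1]; the QLoRA paper
# derives its exact NF4 levels differently, so treat this as a rough
# illustration of the idea, not the real implementation.
import statistics

# Hypothetical 16 quantization levels (NOT the official NF4 table)
LEVELS = [statistics.NormalDist().inv_cdf((i + 0.5) / 16) for i in range(16)]
_scale = max(abs(level) for level in LEVELS)
LEVELS = [level / _scale for level in LEVELS]  # normalize into [-1, 1]

def quantize_block(weights):
    """Map each weight to the nearest 4-bit level; store one absmax scale."""
    absmax = max(abs(w) for w in weights) or 1.0
    idx = [min(range(16), key=lambda i: abs(w / absmax - LEVELS[i]))
           for w in weights]
    return idx, absmax

def dequantize_block(idx, absmax):
    """Recover approximate weights from the 4-bit indices and the scale."""
    return [LEVELS[i] * absmax for i in idx]

block = [0.12, -0.5, 0.33, 0.0, -0.07, 0.9]  # invented example weights
idx, absmax = quantize_block(block)
approx = dequantize_block(idx, absmax)
```

Each weight costs only 4 bits plus a shared scale per block, which is where the dramatic memory savings come from; QLoRA then trains small LoRA adapters on top of the frozen quantized weights.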
From ChatGPT: By ChatGPT, For ChatGPT - Harnessing ChatGPT for Generating Chatbot Training Data
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
In this study, the researchers harnessed ChatGPT to automatically generate a high-quality chat corpus: ChatGPT is prompted to converse with itself about various seed topics, producing a rich multi-turn dataset. To cover the medical domain, the authors additionally incorporated dialogues tailored to healthcare. They also trained their own model to distinguish good responses from subpar ones, sidestepping the complexities of Reinforcement Learning. The exciting takeaway is that you can use ChatGPT to train a smaller, yet highly capable, model.
To experience their impressive demo firsthand, visit this link: https://huggingface.co/spaces/project-baize/chat-with-baize
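A key step in a self-chat pipeline like this is turning each generated transcript into supervised training pairs. The sketch below shows one plausible way to do that; the function name, transcript format, and the example dialogue are all invented for illustration, and the real Baize pipeline has its own templates and prompts.

```python
# Toy sketch: convert a "self-chat" transcript into (context, response)
# training pairs, in the spirit of Baize's data pipeline. The transcript
# format and example dialogue here are invented for illustration.

def to_training_examples(transcript):
    """Build (context, response) pairs from alternating user/AI turns."""
    examples = []
    context = []
    for speaker, text in transcript:
        if speaker == "ai":
            # The model should produce this AI turn given all prior turns.
            prompt = "\n".join(f"[{s}]: {t}" for s, t in context)
            examples.append((prompt, text))
        context.append((speaker, text))
    return examples

# Invented example transcript seeded by a topic
transcript = [
    ("user", "How do I speed up a slow SQL query?"),
    ("ai", "Start by checking whether the columns in your WHERE clause are indexed."),
    ("user", "What if the index exists but isn't used?"),
    ("ai", "Run EXPLAIN to see the query plan; a function on the column can defeat the index."),
]
pairs = to_training_examples(transcript)
```

Each AI turn becomes one training example whose input is the full conversation so far, which is what lets a small model learn multi-turn behaviour from single generated transcripts.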
A Reality Check for Your Alpaca Models - Evaluating the Accuracy of Open Source Language Models in Emulating ChatGPT
There has been a recent surge in smaller language models aiming to imitate ChatGPT, with Alpaca being the first notable example. These models are trained by finetuning a small model on instruction–response pairs generated by a large proprietary model, with instructions like "Provide me with three productivity tips" and their corresponding answers. However, a new paper titled "The False Promise of Imitating Proprietary LLMs" investigates whether this kind of training can close the gap to proprietary models like ChatGPT. The answer, unfortunately, is NO. According to the paper, models like Alpaca excel at imitating the style of ChatGPT but struggle with factual accuracy. In fact, the study shows that training such imitation models can even hurt downstream performance on certain tasks. Exercise caution when imitating proprietary models, and thoroughly test them before deployment.
Cool tools
lmstudio.ai - The power of chatbots from the comfort of your MacBook: download and run language models locally.
https://github.com/OpenBMB/ToolBench - If you want your language models to learn how to use tools, you need datasets. This repository helps you collect such datasets and provides code to train tool-using language models.

