Introducing Vicuna-13B: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality By: The Vicuna Team, March 30, 2023
We’re proud to present Vicuna-13B, an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. In our preliminary evaluation, with GPT-4 acting as the judge, Vicuna-13B achieves more than 90% of the quality of OpenAI ChatGPT and Google Bard, and it outperforms other models such as LLaMA and Stanford Alpaca in more than 90% of cases. The cost of training Vicuna-13B was approximately $300. We’ve made the code, model weights, and an online demo publicly accessible for non-commercial use.
Vicuna AI is an open-source chatbot built by the Vicuna Team and fine-tuned on real user conversations sourced from ShareGPT. In initial assessments judged by GPT-4, Vicuna-13B reaches more than 90% of the quality of OpenAI ChatGPT and Google Bard, and it outperforms models like LLaMA and Stanford Alpaca in more than 90% of scenarios.
Key Features of Vicuna AI
- Enhanced Performance: Vicuna AI has undergone intensive fine-tuning with 70,000 user-shared ChatGPT conversations. As a result, it excels in generating highly detailed and well-structured responses, far superior to Alpaca.
- Innovative Evaluation System: The Vicuna Team has introduced an automated evaluation framework that uses GPT-4 to generate benchmark questions and judge model outputs. While this approach is preliminary rather than a rigorous benchmark, it offers a consistent, scalable way to compare chatbot quality.
- Efficient Serving System: Vicuna AI boasts a streamlined distributed serving system. This lightweight architecture can seamlessly handle multiple models, thanks to its distributed workers.
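The GPT-4-based evaluation described above can be pictured as pairwise scoring: for each benchmark question, the judge model sees two answers and assigns each a score. The sketch below illustrates the idea in Python; the prompt template and score format are assumptions for illustration, not the Vicuna team's exact template.

```python
# Illustrative sketch of GPT-4-as-judge pairwise scoring.
# The prompt wording and "two scores on the first line" convention
# are hypothetical; the real evaluation prompt differs.

def build_judge_prompt(question: str, answer_a: str, answer_b: str) -> str:
    """Assemble a single prompt asking a judge model to score two answers."""
    return (
        "You are a helpful and precise assistant for checking the quality "
        "of two answers.\n"
        f"[Question]\n{question}\n"
        f"[Assistant 1]\n{answer_a}\n"
        f"[Assistant 2]\n{answer_b}\n"
        "Rate each assistant on a scale of 1 to 10. First output the two "
        "scores separated by a space, then explain your reasoning."
    )

def parse_scores(judge_reply: str) -> tuple[float, float]:
    """Extract the two numeric scores from the first line of the judge's reply."""
    first_line = judge_reply.strip().splitlines()[0]
    a, b = first_line.split()[:2]
    return float(a), float(b)

prompt = build_judge_prompt(
    "Explain photosynthesis briefly.",
    "Plants convert sunlight, water, and CO2 into glucose and oxygen.",
    "Photosynthesis is when plants eat light.",
)
print(parse_scores("9 4\nAssistant 1 gives a complete, accurate summary."))  # (9.0, 4.0)
```

Summing such per-question scores over a benchmark set is what produces aggregate totals like those in Table 2 below.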
Versatile Applications of Vicuna AI
Vicuna-13B serves as an exemplary chatbot system, proficient in generating comprehensive and well-structured responses. It also acts as an open canvas for future research endeavors. While it may have limitations in tasks involving reasoning or mathematics, and may face challenges in accurately identifying itself or ensuring factual accuracy, Vicuna AI paves the way for tackling these limitations head-on.
Limitations to Consider
It’s essential to acknowledge that Vicuna AI, like other large language models, does have its limitations. Tasks requiring intricate reasoning or mathematics might pose challenges. Additionally, there might be instances where it struggles with accurate self-identification or ensuring absolute factual precision. Safety, toxicity, and bias mitigation are areas that continue to be optimized.
Release and Licensing Information
Excitingly, the Vicuna AI project is open to the public. The training, serving, and evaluation code are readily available on GitHub. Furthermore, the Vicuna-13B model weights have been made accessible to the community. While the online demo offers a sneak peek into the future of conversational AI, please note that it is a research preview meant solely for non-commercial use. The code is released under the Apache License 2.0, promoting collaboration and innovation within the community.
FAQs
Q1: What is Vicuna AI?
A1: Vicuna AI is an open-source chatbot developed by the Vicuna Team. It was fine-tuned on real user conversations from ShareGPT and, in preliminary GPT-4-judged evaluations, outperforms other prominent open models in over 90% of cases.
Q2: What sets Vicuna AI apart from other chatbot models?
A2: Vicuna AI stands out due to its enhanced performance, generating detailed and well-structured responses after fine-tuning with 70,000 user-shared ChatGPT conversations. It also features an advanced evaluation framework based on GPT-4 and an efficient distributed serving system.
Q3: What are the use cases of Vicuna AI?
A3: Vicuna AI, specifically Vicuna-13B, serves as a chatbot system capable of generating comprehensive and well-structured responses. It also serves as a starting point for future research into its known limitations, including tasks involving reasoning and mathematics, accurate self-identification, and factual accuracy.
Q4: What limitations does Vicuna AI have?
A4: Like other large language models, Vicuna AI has limitations. It may face challenges in tasks involving reasoning, mathematics, accurate self-identification, and ensuring absolute factual precision. Efforts are ongoing to optimize safety, mitigate toxicity, and address biases.
Q5: How can developers access Vicuna AI?
A5: Developers can access Vicuna AI through its open-source GitHub repository. The training, serving, and evaluation code, along with the Vicuna-13B model weights, are available to the public. An online demo is also provided for research purposes, under the Apache License 2.0 for non-commercial use.
Pricing
- FREE | Open Source | Community support via Discord
Table 1. Comparison between several notable models
| Model Name | LLaMA | Alpaca | Vicuna | Bard/ChatGPT |
|---|---|---|---|---|
| Dataset | Publicly available datasets (1T tokens) | Self-instruct from davinci-003 API (52K samples) | User-shared conversations (70K samples) | N/A |
| Training code | N/A | Available | Available | N/A |
| Evaluation metrics | Academic benchmark | Author evaluation | GPT-4 assessment | Mixed |
| Training cost (7B) | 82K GPU-hours | $500 (data) + $100 (training) | $140 (training) | N/A |
| Training cost (13B) | 135K GPU-hours | N/A | $300 (training) | N/A |
Table 2. Total Scores Assessed by GPT-4.
| Baseline | Baseline Score | Vicuna Score |
|---|---|---|
| LLaMA-13B | 513.0 | 694.0 |
| Alpaca-13B | 583.0 | 704.0 |
| Bard | 664.0 | 655.5 |
| ChatGPT | 693.0 | 638.0 |
Vicuna, our open-source chatbot developed through a collaboration between UC Berkeley, CMU, Stanford, UC San Diego, and MBZUAI, represents a significant step forward in conversational AI. Like other large language models, however, Vicuna has limitations: it struggles with tasks involving reasoning and mathematics, accurate self-identification, and ensuring the factual accuracy of its outputs. While we’ve incorporated the OpenAI moderation API to filter inappropriate user inputs in our online demo, addressing these challenges remains a priority. Despite these limitations, Vicuna serves as a promising foundation for future research aimed at overcoming these hurdles.
Vicuna’s Performance: Following the fine-tuning process with 70,000 user-shared ChatGPT conversations, Vicuna demonstrates its ability to produce highly detailed and well-structured responses. In fact, Vicuna’s responses are comparable in quality to ChatGPT, surpassing Alpaca and showcasing a superior level of detail and structure (refer to examples below).
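Fine-tuning on user-shared conversations requires flattening each multi-turn ShareGPT record into a single training sequence. The Python sketch below shows one plausible way to do this; the system message, role tags, and separator are assumptions for illustration, and the actual Vicuna conversation template differs by version.

```python
# Illustrative sketch: flattening a ShareGPT-style conversation into one
# training string. Role tags and separators here are assumptions; the
# real Vicuna template is version-specific.

SYSTEM = ("A chat between a curious user and an artificial intelligence "
          "assistant.")

def format_conversation(turns: list[dict]) -> str:
    """turns: [{'from': 'human'|'gpt', 'value': str}, ...] as in ShareGPT dumps."""
    parts = [SYSTEM]
    for turn in turns:
        role = "USER" if turn["from"] == "human" else "ASSISTANT"
        parts.append(f"{role}: {turn['value']}")
    # During fine-tuning, the loss would typically be computed only on the
    # ASSISTANT spans, so the model learns to produce answers, not questions.
    return " ".join(parts)

example = [
    {"from": "human", "value": "What is the capital of France?"},
    {"from": "gpt", "value": "The capital of France is Paris."},
]
print(format_conversation(example))
```

Handling multi-turn context this way, rather than training on isolated question-answer pairs, is one reason conversational fine-tuning data like ShareGPT's yields more coherent dialogue behavior.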
Limitations:
| Challenges | Mitigation Efforts |
|---|---|
| Tasks involving reasoning and mathematics | Ongoing research to enhance capabilities and accuracy |
| Accurate self-identification and factual precision | Continuous optimization and refinement processes |
| Safety concerns, toxicity, and bias mitigation | Implementation of the OpenAI moderation API in the online demo |
Release Information:
In our initial release, we are sharing the training, serving, and evaluation code through our GitHub repository: https://github.com/lm-sys/FastChat. Additionally, we’ve made the Vicuna-13B model weights accessible to the public. While the dataset won’t be released, we invite you to join our Discord server and follow our Twitter accounts for the latest updates.
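For readers who want to try the released code, the FastChat repository documents a small set of launch commands; the sketch below shows the general pattern, with the model path left as a placeholder since weight locations vary by setup.

```shell
# Install FastChat from the repository, then serve a model.
# /path/to/vicuna-13b is a placeholder for your local weights directory.

# Single-user interactive chat in the terminal:
python3 -m fastchat.serve.cli --model-path /path/to/vicuna-13b

# Distributed serving: a controller coordinates one or more model workers,
# and a Gradio server provides the web UI (run each in its own terminal).
python3 -m fastchat.serve.controller
python3 -m fastchat.serve.model_worker --model-path /path/to/vicuna-13b
python3 -m fastchat.serve.gradio_web_server
```

The controller/worker split is what lets the lightweight serving system mentioned above host multiple models behind one interface: each worker registers with the controller, which routes requests to whichever worker serves the requested model.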
License Details:
The online demo is a research preview intended for non-commercial use only. Users are bound by the LLaMA model license, OpenAI’s Terms of Use for generated data, and ShareGPT’s privacy practices. If you suspect any violations, please contact us for prompt resolution. The code is released under the Apache License 2.0, fostering collaboration and innovation within the community.
Acknowledgment:
We extend our gratitude to our collaborators from BAIR, Stanford Alpaca team, and MBZUAI for their invaluable discussions and feedback. For further inquiries, please feel free to contact Lianmin Zheng (lianminzheng@gmail.com), Hao Zhang (sjtu.haozhang@gmail.com), or LMSYS (lmsys.org@gmail.com).