Model Tag:
qwen
Qwen 1.5 is a family of large language models developed by Alibaba Cloud, with parameter counts ranging from 0.5 billion to 110 billion.
License
Tongyi Qianwen RESEARCH LICENSE AGREEMENT
Last Updated
2024-05-04
Qwen 1.5 - An Overview of Alibaba Cloud's Latest Large Language Model
Qwen 1.5 represents a significant advancement in the series of large language models developed by Alibaba Cloud. These models use the transformer architecture and are pre-trained on extensive datasets of web text, books, code, and more. With its range of parameter sizes and improved capabilities, Qwen 1.5 is suited to a wide variety of natural language processing (NLP) applications.
What's New in Qwen 1.5
Qwen 1.5 introduces eight model sizes, providing increased flexibility for developers and researchers (a loading sketch follows the list):
- 0.5B
- 1.8B
- 4B (default)
- 7B
- 14B
- 32B
- 72B
- 110B
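Choosing a size is a matter of choosing the matching checkpoint. A minimal loading sketch, assuming the Hugging Face `transformers` library and the public `Qwen/Qwen1.5-*` checkpoint naming scheme (the 4B chat variant is used here purely as an example):

```python
# Minimal sketch: load a Qwen 1.5 checkpoint with Hugging Face transformers.
# The model id follows the public Qwen1.5 naming scheme (an assumption here);
# swap the size suffix (0.5B, 1.8B, 4B, 7B, 14B, 32B, 72B, 110B) as needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-4B-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # let transformers pick bf16/fp16 where supported
    device_map="auto",    # spread layers across available GPUs/CPU
)
```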
Key Features of Qwen 1.5
- Enhanced Performance: Qwen 1.5 shows a significant boost in human preference for chat-based interactions, making it an excellent choice for customer service applications and interactive platforms.
- Multilingual Support: Both the base and chat models support multilingual capabilities, allowing them to cater to diverse user needs across different languages.
- Long Context Lengths: All model sizes support a stable 32K-token context length, allowing for more extensive inputs and improving output quality.
- Memory Efficiency: The smallest models require less than 2 GB of memory, making Qwen 1.5 cost-effective to deploy and accessible to a wide range of users.
- Comprehensive Vocabulary: Qwen uses a vocabulary of more than 150K tokens, which handles multiple languages effectively without needing vocabulary expansion.
- System Prompts for Flexibility: Users can leverage system prompts for role-playing, language style transfer, task setting, and behavior customization, providing great versatility (see the chat sketch after this list).
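As an illustration of system prompts in practice, here is a minimal chat sketch using the tokenizer's standard `apply_chat_template` API from `transformers`; the persona and question are made up for the example, and the model id is the same assumed checkpoint name as above:

```python
# Minimal sketch: steer the chat model with a system prompt.
# The persona and question are illustrative; model id assumed as above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-4B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a courteous airline support agent who answers briefly."},
    {"role": "user", "content": "My flight was cancelled. What are my options?"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn marker
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Changing only the system message is enough to switch persona, style, or task framing without any retraining.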
Training and Performance
Qwen 1.5 has been trained on over 2.2 trillion tokens spanning general and professional text in Chinese and English, code, and mathematics. This extensive training allows it to outperform many existing open-source models across numerous evaluation tasks. Notably, Qwen 1.5 excels at common-sense reasoning, coding, and mathematics, often surpassing larger models on various benchmarks.