Open Source LLMs
This page discusses open source (open weights) models that can be used with Ollama, LM Studio, and similar tools. For proprietary, closed source models, see: Online services
Some large language models are optimized for generic chat; others are coding-oriented.
Model libraries:
Tools for limited GPU resources:
- LLM RAM Calculator
- GPU-Poor LLM Arena (models with maximum 14B parameters)
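A quick back-of-the-envelope estimate can substitute for the calculators above: memory is roughly parameter count times bytes per parameter, plus overhead for the KV cache and activations. The 1.2 overhead factor below is an assumption for illustration, not a value taken from those tools.

```python
# Bytes per parameter for common precisions/quantizations.
BYTES_PER_PARAM = {"fp16": 2.0, "q8_0": 1.0, "q4_0": 0.5}

def estimate_ram_gb(params_billions: float, quant: str = "q4_0",
                    overhead: float = 1.2) -> float:
    """Rough RAM/VRAM in GB needed to run a quantized model locally."""
    return round(params_billions * BYTES_PER_PARAM[quant] * overhead, 1)

# A 14B model (the GPU-Poor Arena cutoff) at 4-bit quantization:
print(estimate_ram_gb(14))         # ~8.4 GB
# The same rule of thumb for an unquantized 7B model:
print(estimate_ram_gb(7, "fp16"))  # ~16.8 GB
```

Actual usage varies with context length and runtime, so treat the result as a lower bound when choosing a model for limited GPU resources.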
TODO: Update the list of best models
Models for generic chat / instruct (=instruction following)
- Mistral-7B-Instruct-v0.2
- Dolphin-Mistral - Uncensored fine-tune of Mistral that excels at coding tasks
- WizardLM-2 - Fine-tune of Mistral
- Llama 3 by Meta <– Top
- DBRX by Databricks
Optimized for Coding
Note: As of 2025, general-purpose (proprietary) models are also excellent for coding. Check Aider LLM Leaderboards.
- Code Llama by Meta
- CodeGemma by Google
- DeepSeek-Coder-V2 - Ollama
- Qwen2.5-Coder (formerly known as CodeQwen) - context: 128K - Ollama <– Top
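Any of the coding models above can be queried through Ollama's local REST API once pulled (e.g. `ollama pull qwen2.5-coder`). A minimal sketch using only the standard library, assuming the default endpoint on port 11434:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_request(model: str, prompt: str) -> dict:
    # stream=False asks Ollama for one complete JSON object instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    body = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(OLLAMA_URL, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    # Requires a running Ollama server with the model already pulled.
    print(generate("qwen2.5-coder",
                   "Write a Python one-liner that reverses a string."))
```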
Optimized for long context / RAG
- Command-R by Cohere (also good for tool use)
Optimized for JSON output / function calling / tool use
- C4AI Command R+ (2024-03-20, CC-BY-NC, Cohere) - A 104B parameter multilingual model with advanced Retrieval Augmented Generation (RAG) and tool use capabilities, optimized for reasoning, summarization, and question answering across 10 languages. Supports quantization for efficient use and demonstrates unique multi-step tool integration for complex task execution.
- Hermes 2 Pro - Mistral 7B (2024-03-13, Nous Research) - A 7B parameter model that excels at function calling, JSON structured outputs, and general tasks. Trained on an updated OpenHermes 2.5 Dataset and a new function calling dataset, it uses a special system prompt and multi-turn structure. Achieves 91% on function calling and 84% on JSON mode evaluations.
- Gorilla OpenFunctions v2 (2024-02-27, Apache 2.0 license, Charlie Cheng-Jie Ji et al.) - Interprets and executes functions based on JSON Schema Objects, supporting multiple languages and detecting function relevance.
- NexusRaven-V2 (2023-12-05, Nexusflow) - A 13B model outperforming GPT-4 in zero-shot function calling by up to 7%, enabling effective use of software tools. Further instruction-tuned on CodeLlama-13B-instruct.
- Functionary (2023-08-04, MeetKai) - Interprets and executes functions based on JSON Schema Objects, supporting various compute requirements and call types. Compatible with OpenAI-python and llama-cpp-python for efficient function execution in JSON generation tasks.
Source: awesome-llm-json
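On the application side, these models emit a structured call that your code must extract and dispatch. The sketch below parses the Hermes-style `<tool_call>` convention (a JSON object wrapped in tags); the exact tag format varies by model, so treat the pattern as an assumption to adapt per model card.

```python
import json
import re

# Hermes-style models wrap each call in <tool_call>...</tool_call> tags;
# other models use different conventions, so this regex is an assumption.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def parse_tool_calls(completion: str) -> list[dict]:
    """Extract {"name": ..., "arguments": ...} objects from a completion."""
    return [json.loads(m) for m in TOOL_CALL_RE.findall(completion)]

# Example completion as such a model might return it:
reply = '<tool_call>{"name": "get_weather", "arguments": {"city": "Paris"}}</tool_call>'
calls = parse_tool_calls(reply)
print(calls[0]["name"])  # get_weather
```

After parsing, the application runs the named function with the given arguments and feeds the result back to the model in the next turn.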
Vision models (multimodal)
- LLaVA - details
- Llama 3.2 by Meta
- Pixtral 12B by Mistral - https://youtu.be/7aGTKJJMb5w
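Vision models served through Ollama accept images as base64 strings in the request body. A minimal sketch of building such a payload for LLaVA (the `build_vision_request` helper name is illustrative, not part of any API):

```python
import base64
import json

def build_vision_request(model: str, prompt: str, image_bytes: bytes) -> dict:
    # Ollama's /api/generate accepts an "images" list of base64-encoded
    # files for multimodal models such as llava.
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }

# In real use, read image_bytes from a file; dummy bytes shown here.
payload = build_vision_request("llava", "What is in this picture?", b"\x89PNG...")
print(json.dumps(payload)[:60])
```

The payload is POSTed to the same `/api/generate` endpoint as text-only requests; the model describes or answers questions about the attached image.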