Open Source LLMs
This page discusses open source (open weights) models that can be used with Ollama, LM Studio, and similar tools. For proprietary, closed source models, see: Online services
Some large language models are optimized for generic chat; others are coding-oriented.
Model libraries:
Tools for limited GPU resources:
- LLM RAM Calculator
- GPU-Poor LLM Arena (models with maximum 14B parameters)
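A quick back-of-the-envelope estimate can substitute for the calculators above: memory is roughly parameter count times bytes per parameter, plus overhead for the KV cache and activations. The 1.2 overhead factor below is an assumption for illustration, not a value taken from those tools.

```python
# Bytes per parameter for common precisions/quantizations.
BYTES_PER_PARAM = {"fp16": 2.0, "q8_0": 1.0, "q4_0": 0.5}

def estimate_ram_gb(params_billions: float, quant: str = "q4_0",
                    overhead: float = 1.2) -> float:
    """Rough RAM/VRAM in GB needed to run a quantized model locally."""
    return round(params_billions * BYTES_PER_PARAM[quant] * overhead, 1)

# A 14B model (the GPU-Poor Arena cutoff) at 4-bit quantization:
print(estimate_ram_gb(14))         # ~8.4 GB
# The same rule of thumb for an unquantized 7B model:
print(estimate_ram_gb(7, "fp16"))  # ~16.8 GB
```

Actual usage varies with context length and runtime, so treat the result as a lower bound when choosing a model for limited GPU resources.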
TODO: Update the list of best models
Models for generic chat / instruct (=instruction following)
- Mistral-7B-Instruct-v0.2
- Dolphin-Mistral - Uncensored fine-tune of Mistral that excels at coding tasks
- WizardLM-2 - Fine-tune of Mistral
- Llama 3 by Meta <– Top
- DBRX by Databricks
Optimized for Coding
Note: As of 2025, general-purpose (proprietary) models are also excellent for coding. Check Aider LLM Leaderboards.
- Code Llama by Meta
- CodeGemma by Google
- DeepSeek-Coder-V2 - Ollama
- Qwen2.5-Coder (formerly known as CodeQwen) - context: 128K - Ollama <– Top
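Any of the coding models above can be queried through Ollama's local REST API once pulled (e.g. `ollama pull qwen2.5-coder`). A minimal sketch using only the standard library, assuming the default endpoint on port 11434:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_request(model: str, prompt: str) -> dict:
    # stream=False asks Ollama for one complete JSON object instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    body = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(OLLAMA_URL, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    # Requires a running Ollama server with the model already pulled.
    print(generate("qwen2.5-coder",
                   "Write a Python one-liner that reverses a string."))
```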
Optimized for long context / RAG
- Command-R by Cohere (also good for tool use)
Optimized for JSON output / function calling / tool use
- C4AI Command R+ (2024-03-20, CC-BY-NC, Cohere) - A 104B parameter multilingual model with advanced Retrieval Augmented Generation (RAG) and tool use capabilities, optimized for reasoning, summarization, and question answering across 10 languages. Supports quantization for efficient use and demonstrates unique multi-step tool integration for complex task execution.
- Hermes 2 Pro - Mistral 7B (2024-03-13, Nous Research) - A 7B parameter model that excels at function calling, JSON structured outputs, and general tasks. Trained on an updated OpenHermes 2.5 Dataset and a new function calling dataset, it uses a special system prompt and multi-turn structure. Achieves 91% on function calling and 84% on JSON mode evaluations.
- Gorilla OpenFunctions v2 (2024-02-27, Apache 2.0 license, Charlie Cheng-Jie Ji et al.) - Interprets and executes functions based on JSON Schema Objects, supporting multiple languages and detecting function relevance.
- NexusRaven-V2 (2023-12-05, Nexusflow) - A 13B model outperforming GPT-4 in zero-shot function calling by up to 7%, enabling effective use of software tools. Further instruction-tuned on CodeLlama-13B-instruct.
- Functionary (2023-08-04, MeetKai) - Interprets and executes functions based on JSON Schema Objects, supporting various compute requirements and call types. Compatible with OpenAI-python and llama-cpp-python for efficient function execution in JSON generation tasks.
Source: awesome-llm-json
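On the application side, these models emit a structured call that your code must extract and dispatch. The sketch below parses the Hermes-style `<tool_call>` convention (a JSON object wrapped in tags); the exact tag format varies by model, so treat the pattern as an assumption to adapt per model card.

```python
import json
import re

# Hermes-style models wrap each call in <tool_call>...</tool_call> tags;
# other models use different conventions, so this regex is an assumption.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def parse_tool_calls(completion: str) -> list[dict]:
    """Extract {"name": ..., "arguments": ...} objects from a completion."""
    return [json.loads(m) for m in TOOL_CALL_RE.findall(completion)]

# Example completion as such a model might return it:
reply = '<tool_call>{"name": "get_weather", "arguments": {"city": "Paris"}}</tool_call>'
calls = parse_tool_calls(reply)
print(calls[0]["name"])  # get_weather
```

After parsing, the application runs the named function with the given arguments and feeds the result back to the model in the next turn.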
Vision models (multimodal)
- LLaVA - details
- Llama 3.2 by Meta
- Pixtral 12B by Mistral - https://youtu.be/7aGTKJJMb5w
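Vision models served through Ollama accept images as base64 strings in the request body. A minimal sketch of building such a payload for LLaVA (the `build_vision_request` helper name is illustrative, not part of any API):

```python
import base64
import json

def build_vision_request(model: str, prompt: str, image_bytes: bytes) -> dict:
    # Ollama's /api/generate accepts an "images" list of base64-encoded
    # files for multimodal models such as llava.
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }

# In real use, read image_bytes from a file; dummy bytes shown here.
payload = build_vision_request("llava", "What is in this picture?", b"\x89PNG...")
print(json.dumps(payload)[:60])
```

The payload is POSTed to the same `/api/generate` endpoint as text-only requests; the model describes or answers questions about the attached image.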