Discover AI models for every task
by NousResearch
NousResearch Hermes 4 70B is a frontier hybrid-mode reasoning model based on Llama-3.1-70B, trained on ~60B tokens across ~5M samples. Features explicit <think> deliberation segments, with reported improvements in math, code, STEM, logic, and creativity. Achieves state-of-the-art (SOTA) results on RefusalBench for helpful, uncensored responses while maintaining alignment to user values. Supports schema-adherent structured JSON outputs, function calling, and tool use. Trained for extreme steerability, with lower refusal rates than previous Hermes versions.
NousResearch Hermes 4 405B is the flagship hybrid-mode reasoning model based on Meta's Llama-3.1-405B architecture. Trained on a massive ~60B token corpus with explicit <think> deliberation segments, it delivers frontier-level performance in math, code, STEM, logic, and creative tasks. Achieves SOTA on RefusalBench for helpful, uncensored responses aligned to user values. Supports advanced function calling, structured JSON outputs, and tool use with extreme steerability and reduced refusal rates.
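As a rough illustration of working with the <think> deliberation segments mentioned in the two Hermes 4 entries, the sketch below separates deliberation text from the final answer in a raw completion. It assumes each segment is closed by a matching </think> tag, which the listing does not state, and `split_think` is a hypothetical helper name, not part of any NousResearch tooling.

```python
import re

# Assumption: deliberation is wrapped in literal <think>...</think> tags.
# The listing only mentions the opening tag; the closing tag is assumed here.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(text: str) -> tuple[list[str], str]:
    """Return (deliberation segments, remaining answer text)."""
    segments = [s.strip() for s in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return segments, answer


# Hypothetical completion, written by hand for illustration only.
sample = "<think>2 + 2 is 4.</think>The answer is 4."
segments, answer = split_think(sample)
print(segments)
print(answer)
```

Text without any <think> block passes through unchanged, so the same helper can be applied whether or not reasoning mode was active.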
by Meta
Meta Llama 3.3 70B Instruct is a multilingual instruction-tuned model optimized for dialogue. Trained on ~15 trillion tokens with a knowledge cutoff of December 2023, it outperforms many open-source and closed models. Reported scores include 92.1% on IFEval (steerability), 88.4% on HumanEval (code), 77.0% on MATH, and 91.1% on MGSM (multilingual). Features 128K context and Grouped-Query Attention, and supports 8 languages: English, German, French, Spanish, Italian, Portuguese, Hindi, and Thai. Trained on 7M GPU hours with 100% renewable energy.
by ZAI
GLM-4.7 is Z.ai's latest large language model with enhanced reasoning capabilities. Excels at mathematical problem solving, coding, and complex logical tasks. Features improved context understanding and multilingual support.
by Google
Google Gemma 4 31B is a 31B parameter dense multimodal language model with a 256K context window. It processes text, images, and video inputs and generates text output, featuring a configurable thinking mode for step‑by‑step reasoning. The model achieves 85.2% on MMLU Pro, 80.0% on LiveCodeBench v6, and 88.4% on MMMLU, demonstrating strong performance across reasoning and multimodal benchmarks. Available under the Apache 2.0 license.
by ZAI
GLM-4.5 Air is a compact 106B parameter Mixture-of-Experts model with 12B active parameters, optimized for efficiency while maintaining strong performance. Scores 59.8 across 12 industry benchmarks with better resource efficiency than the full GLM-4.5. Features hybrid reasoning mode with 128K context and supports intelligent agent functions and tool calling. Released under the MIT license with commercial use allowed. Suited to deployments that need a balance between capability and computational cost.
GLM-4.5 is a 355B parameter Mixture-of-Experts foundation model with 32B active parameters, designed for intelligent agents. Features hybrid reasoning mode with configurable thinking, enabled by default. Ranks 3rd, with a score of 63.2 across 12 industry benchmarks, among all proprietary and open-source models. Released under the MIT license with 128K context; supports reasoning, coding, and intelligent agent functions including OpenAI-style tool calling. Incorporates MTP (Multi-Token Prediction) layers with speculative decoding for efficient inference.
GLM-4.6 is a frontier-scale 355B parameter Mixture-of-Experts model with a 200K context window and 128K output capability. MIT licensed, making it the only model in its class that enterprises can self-host and deeply customize. Ranks #1 on LiveCodeBench v6 (82.8%) and HLE, and #3 on AIME 2025 (93.9%) and Terminal-Bench (40.5%). Near parity with Claude Sonnet 4 (48.6% win rate) while outperforming other open-source baselines. Purpose-built for agentic workflows, real-world coding, and tool-augmented problem-solving. Supports native tool calling during inference for complex multi-step tasks.
by Mistral
Mistral Small 4 is a 119B-parameter Mixture-of-Experts model (128 experts, 4 active per token, 6.5B active parameters) that unifies instruct, reasoning, and coding capabilities into a single multimodal model. It accepts text and image inputs, supports function calling, structured outputs, and configurable reasoning effort (none for fast responses, high for deep step-by-step reasoning). With a 256K context window and Apache 2.0 license, it delivers 40% lower latency and 3x higher throughput compared to Mistral Small 3.
by Qwen
Qwen3 Coder 480B A35B Instruct is a specialized Mixture-of-Experts coding model with 480B total parameters and 35B activated. Optimized specifically for code generation, code understanding, debugging, and software engineering tasks. Features 262K native context for handling large codebases, strong performance on coding benchmarks including LiveCodeBench and HumanEval, and support for multiple programming languages. Excels at complex algorithmic problems, code refactoring, and technical documentation generation.
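Several of the Mixture-of-Experts entries above quote both a total and an active parameter count. The sketch below computes the active share for each, using only figures copied from those entries. `MOE_MODELS` and `active_fraction` are hypothetical names chosen for illustration, and the ratio is at best a rough indicator of per-token compute, not of memory footprint or of quality.

```python
# (total, active) parameter counts in billions, copied from the listings above.
MOE_MODELS = {
    "GLM-4.5 Air": (106, 12),
    "GLM-4.5": (355, 32),
    "Mistral Small 4": (119, 6.5),
    "Qwen3 Coder 480B A35B Instruct": (480, 35),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters used per token, as a value between 0 and 1."""
    return active_b / total_b


for name, (total, active) in MOE_MODELS.items():
    share = active_fraction(total, active)
    print(f"{name}: {share:.1%} of parameters active per token")
```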