DeepSeek R1 Distill Llama 3.3 70B
DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, trained using outputs from DeepSeek R1.
DeepSeek R1 671B
Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It has 671B total parameters, with 37B active per inference pass.
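Models in this catalog are typically reached through an OpenAI-compatible chat API. A minimal sketch follows, assuming such an endpoint; the base URL, environment variable, and model identifier are placeholders, not values confirmed by this listing.

```python
# Minimal sketch: querying DeepSeek R1 through an OpenAI-compatible endpoint.
# The base_url and model id below are assumptions; check your provider's docs.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",  # hypothetical endpoint
    api_key=os.environ["API_KEY"],          # assumes the key is set in the environment
)

response = client.chat.completions.create(
    model="deepseek-r1-671b",  # assumed model id; the listing does not specify one
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

# R1 exposes its reasoning tokens openly; depending on the provider they may
# arrive in a separate field or inline before the final answer.
print(response.choices[0].message.content)
```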
DeepSeek V3-0324
State-of-the-art Mixture-of-Experts model with coding performance on par with Claude 3.7 Sonnet while remaining fast and cost-effective.
DeepSeek R1 0528
DeepSeek R1 0528 is an FP8-quantized model based on the deepseek_v3 architecture, designed for complex computations and demanding workloads. It has demonstrated outstanding performance across benchmark evaluations in mathematics, programming, and general logic, with overall performance approaching that of leading models such as o3 and Gemini 2.5 Pro.
Apriel 5B Instruct
The Apriel model family offers high efficiency across diverse tasks through a multi-stage training process that includes continual pretraining, supervised finetuning, and alignment techniques.
Hermes 3 Llama 3.1 405B FP8
Hermes 3 is a generalist finetune of the Llama-3.1 405B foundation model, adding advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long-context coherence, and improvements across the board.
Llama 3.1 Nemotron 70B Instruct
Nvidia's fine-tune of Llama 3.1, topping alignment benchmarks and optimized for instruction following.
Liquid AI 40B
LFM-40B offers a new balance between model size and output quality. Its performance is comparable to that of larger models, while its MoE architecture enables higher throughput and deployment on more cost-effective hardware.
Llama 3.3 70B
The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many available open-source and closed chat models on common industry benchmarks.
Llama 4 Maverick 17B
Meta's most intelligent multimodal open-source model in its class. Features 17B active parameters with 128 experts (400B total), beating GPT-4o and Gemini 2.0 Flash on coding, reasoning, and image understanding benchmarks. First Llama model with native multimodality and a mixture-of-experts architecture.
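As a quick illustration of what "active parameters" means in a mixture-of-experts model, the toy sketch below routes each token to a single expert, so only that expert's weights do work even though every expert counts toward total size. It is illustrative only, not Llama 4's actual routing.

```python
# Toy sketch of mixture-of-experts routing (illustrative only; not Llama 4's
# actual implementation). A router picks one expert per token, so only that
# expert's weights are "active" even though all experts count toward total size.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts = 8, 4

# Each expert is a small feed-forward weight matrix.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    # Router scores -> softmax -> pick the top-1 expert for this token.
    scores = x @ router
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    k = int(np.argmax(probs))
    # Only expert k's parameters are touched for this token ("active parameters").
    return probs[k] * (x @ experts[k])

token = rng.standard_normal(d_model)
print(moe_forward(token).shape)  # (8,) -- one expert did the work
```

Scaled up, the same idea is why Maverick stores 400B parameters but spends compute on only 17B per token.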
Llama 4 Scout 17B
Meta's efficient multimodal model with an industry-leading 10M token context window. Features 17B active parameters with 16 experts (109B total), optimized for long-context tasks like multi-document analysis, extensive codebase reasoning, and personalized AI experiences.
Qwen 2.5 Coder 32B Instruct
Qwen2.5-Coder-32B-Instruct is Qwen's flagship code-focused model, delivering state-of-the-art open-source performance in code generation, code reasoning, and code repair across a wide range of programming languages.
Qwen3-32B-FP8
Qwen3-32B-FP8 is part of Alibaba's latest generation of hybrid reasoning models, featuring seamless switching between thinking and non-thinking modes. With 32.8B parameters and FP8 quantization, it offers strong reasoning capabilities for mathematics, coding, and logical tasks while maintaining cost efficiency. The model supports 119 languages and dialects, with MCP support for advanced agent workflows.
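A short sketch of switching Qwen3 between thinking and non-thinking modes via Hugging Face transformers follows. The enable_thinking flag mirrors Qwen3's published model-card usage; the repository id is an assumption to verify before use.

```python
# Sketch of toggling Qwen3's hybrid reasoning modes via transformers.
# The enable_thinking flag follows Qwen3's model-card usage; model id assumed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-32B-FP8"  # assumed repo name; verify before use
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "How many primes are below 100?"}]

# enable_thinking=True lets the model emit its <think>...</think> reasoning
# first; set it to False for direct answers without the reasoning trace.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```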