DeepSeek R1 Distill Llama 3.3 70B
DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, trained using outputs from DeepSeek R1.
DeepSeek R1 671B
Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It has 671B total parameters, with 37B active per inference pass.
Hermes-3-Llama-3.1-405B-FP8
Hermes 3 is a generalist finetune of the Llama-3.1 405B foundation model, with advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long-context coherence, and improvements across the board.
Llama-3.1-Nemotron-70b-instruct
Nvidia's latest Llama fine-tune, topping alignment benchmarks and optimized for instruction following.
Liquid-AI-40B
LFM-40B offers a new balance between model size and output quality. Its performance is comparable to models larger than itself, while its MoE architecture enables higher throughput and deployment on more cost-effective hardware.
Llama 3.3 70B
The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many available open-source and closed chat models on common industry benchmarks.
Qwen2.5-VL-32B
Qwen2.5-VL-32B is a multimodal vision-language model fine-tuned through reinforcement learning for enhanced mathematical reasoning, structured outputs, and visual problem-solving capabilities.