DeepSeek-R1 is the latest AI model from DeepSeek, a Chinese AI company. It is an open model with state-of-the-art reasoning capabilities. Instead of giving a direct response to a user's query, it thinks through the problem, applies logic, and then generates the best response. Spending extra compute on this reasoning process at inference time is known as test-time scaling, and DeepSeek-R1 is built to take advantage of it. These capabilities have driven strong demand for the model. Microsoft previously announced that it was bringing DeepSeek-R1 to Azure AI and GitHub, and now NVIDIA has made DeepSeek-R1 available with NVIDIA NIM integration.
![DeepSeek-R1 available with NVIDIA NIM Integration](https://aitipsguide.com/wp-content/uploads/2025/02/DeepSeek-R1-available-with-NVIDIA-NIM-Integration.png)
DeepSeek-R1 is now available with NVIDIA NIM Integration
DeepSeek-R1 delivers high inference efficiency and high accuracy on tasks demanding logical inference, reasoning, math, coding, and language understanding. R1 incorporates an impressive 671 billion parameters, roughly 10 times more than many other popular large language models, and it supports a large input context length of 128,000 tokens. That is why NVIDIA is bringing DeepSeek-R1 to NIM microservices.
NVIDIA NIM is a set of accelerated inference microservices that allow organizations to run AI models on NVIDIA GPUs anywhere.
The DeepSeek-R1 NIM microservice simplifies deployments with support for industry-standard APIs. Enterprises can maximize security and data privacy by running the NIM microservice on their preferred accelerated computing infrastructure. Using NVIDIA AI Foundry with NVIDIA NeMo software, enterprises will also be able to create customized DeepSeek-R1 NIM microservices for specialized AI agents.
NVIDIA has made the full 671-billion-parameter DeepSeek-R1 model available as an NVIDIA NIM microservice preview on build.nvidia.com to help developers experiment and build their own specialized agents. According to NVIDIA, the DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.
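Because the NIM microservice exposes an industry-standard, OpenAI-compatible chat completions API, calling the preview from code is straightforward. The sketch below shows roughly what such a request could look like; the endpoint URL, model identifier, and request fields are assumptions based on typical build.nvidia.com conventions, so check the model card for the exact values, and substitute your own API key.

```python
# Minimal sketch of calling a DeepSeek-R1 NIM microservice through an
# OpenAI-compatible chat completions API. Endpoint URL and model name
# are assumptions -- verify them on build.nvidia.com before use.
import json
import urllib.request

NIM_ENDPOINT = "https://integrate.api.nvidia.com/v1/chat/completions"  # assumed
API_KEY = "nvapi-..."  # placeholder; use your own key from build.nvidia.com


def build_request(prompt: str) -> urllib.request.Request:
    """Build an HTTP POST request for the chat completions endpoint."""
    payload = {
        "model": "deepseek-ai/deepseek-r1",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
        "temperature": 0.6,
    }
    return urllib.request.Request(
        NIM_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


req = build_request("Why is the sky blue? Reason step by step.")
print(req.get_full_url())
# Sending the request (requires a valid key and network access):
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```

Since the API is OpenAI-compatible, existing client libraries that let you override the base URL should also work against the same endpoint without code changes beyond configuration.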
NVIDIA also disclosed that upcoming GPUs based on the next-generation NVIDIA Blackwell architecture will deliver up to 20 petaflops of peak FP4 compute performance and a 72-GPU NVLink domain specifically optimized for inference.
You can read the complete announcement on blogs.nvidia.com.