NVIDIA AI Foundry Builds Custom Llama 3.1 Generative AI Models for the World’s Enterprises

Enterprises and Nations Can Now Build ‘Supermodels’ With NVIDIA AI Foundry Using Their Own Data Paired With Llama 3.1 405B and NVIDIA Nemotron Models
NVIDIA AI Foundry Offers Comprehensive Generative AI Model Service Spanning Curation, Synthetic Data Generation, Fine-Tuning, Retrieval, Guardrails and Evaluation to Deploy Custom Llama 3.1 NVIDIA NIM Microservices With New NVIDIA NeMo Retriever Microservices for Accurate Responses
Accenture First to Use New Service to Build Custom Llama 3.1 Models for Clients; AT&T, Uber and Other Industry Leaders Among First to Access New Llama NVIDIA NIM Microservices

NVIDIA today announced a new NVIDIA AI Foundry service and NVIDIA NIM™ inference microservices to supercharge generative AI for the world’s enterprises with the Llama 3.1 collection of openly available models, also introduced today.

With NVIDIA AI Foundry, enterprises and nations can now create custom “supermodels” for their domain-specific industry use cases using Llama 3.1 and NVIDIA software, computing and expertise. Enterprises can train these supermodels with proprietary data as well as synthetic data generated from Llama 3.1 405B and the NVIDIA Nemotron™ Reward model.

NVIDIA AI Foundry is powered by the NVIDIA DGX™ Cloud AI platform, which is co-engineered with the world’s leading public clouds, to give enterprises significant compute resources that easily scale as AI demands change.

The new offerings come at a time when enterprises, as well as nations developing sovereign AI strategies, want to build custom large language models with domain-specific knowledge for generative AI applications that reflect their unique business or culture.

“Meta’s openly available Llama 3.1 models mark a pivotal moment for the adoption of generative AI within the world’s enterprises,” said Jensen Huang, founder and CEO of NVIDIA. “Llama 3.1 opens the floodgates for every enterprise and industry to build state-of-the-art generative AI applications. NVIDIA AI Foundry has integrated Llama 3.1 throughout and is ready to help enterprises build and deploy custom Llama supermodels.”

“The new Llama 3.1 models are a super-important step for open source AI,” said Mark Zuckerberg, founder and CEO at Meta. “With NVIDIA AI Foundry, companies can easily create and customize the state-of-the-art AI services people want and deploy them with NVIDIA NIM. I’m excited to get this in people’s hands.”

To supercharge enterprise deployments of Llama 3.1 models for production AI, NVIDIA NIM inference microservices for Llama 3.1 models are now available for download from ai.nvidia.com. NIM microservices are the fastest way to deploy Llama 3.1 models in production and deliver up to 2.5x higher throughput than running inference without NIM.

Enterprises can pair Llama 3.1 NIM microservices with new NVIDIA NeMo Retriever NIM microservices to create state-of-the-art retrieval pipelines for AI copilots, assistants and digital human avatars.

Accenture Pioneers Custom Llama Supermodels for Enterprises With AI Foundry

Global professional services firm Accenture is the first to adopt NVIDIA AI Foundry to build custom Llama 3.1 models using the Accenture AI Refinery™ framework, both for its own use and for clients seeking to deploy generative AI applications that reflect their culture, languages and industries.

“The world’s leading enterprises see how generative AI is transforming every industry and are eager to deploy applications powered by custom models,” said Julie Sweet, chair and CEO of Accenture. “Accenture has been working with NVIDIA NIM inference microservices for our internal AI applications, and now, using NVIDIA AI Foundry, we can help clients quickly create and deploy custom Llama 3.1 models to power transformative AI applications for their own business priorities.”

NVIDIA AI Foundry provides an end-to-end service for quickly building custom supermodels. It combines NVIDIA software, infrastructure and expertise with open community models, technology and support from the NVIDIA AI ecosystem.

With NVIDIA AI Foundry, enterprises can create custom models using Llama 3.1 models and the NVIDIA NeMo platform — including the NVIDIA Nemotron-4 340B Reward model, ranked first on the Hugging Face RewardBench.

Once custom models are created, enterprises can create NVIDIA NIM inference microservices to run them in production using their preferred MLOps and AIOps platforms on their preferred cloud platforms and NVIDIA-Certified Systems™ from global server manufacturers.

NVIDIA AI Enterprise experts and global system integrator partners work with AI Foundry customers to accelerate the entire process, from development to deployment.

NVIDIA Nemotron Powers Advanced Model Customization

Enterprises that need additional training data for creating a domain-specific model can use Llama 3.1 405B and Nemotron-4 340B together to generate synthetic data to boost model accuracy when creating custom Llama supermodels.

Customers that have their own training data can customize Llama 3.1 models with NVIDIA NeMo for domain-adaptive pretraining, or DAPT, to further increase model accuracy.

NVIDIA and Meta have also teamed up to provide a distillation recipe for Llama 3.1 that developers can use to build smaller custom Llama 3.1 models for generative AI applications. This enables enterprises to run Llama-powered AI applications on a broader range of accelerated infrastructure, such as AI workstations and laptops.

Industry-Leading Enterprises Supercharge AI With NVIDIA and Llama

Companies across healthcare, financial services, retail, transportation and telecommunications are already working with NVIDIA NIM microservices for Llama. Among the first to access the new NIM microservices for Llama 3.1 are AT&T, Uber and other industry leaders.

The Llama 3.1 collection of multilingual LLMs comprises generative AI models in 8B-, 70B- and 405B-parameter sizes. The models were trained on over 16,000 NVIDIA H100 Tensor Core GPUs and are optimized for NVIDIA accelerated computing and software in the data center, in the cloud, and locally on workstations with NVIDIA RTX™ GPUs or PCs with GeForce RTX GPUs.

New NeMo Retriever RAG Microservices Boost Accuracy and Performance

Using new NVIDIA NeMo Retriever NIM inference microservices for retrieval-augmented generation (RAG), organizations can enhance response accuracy when deploying customized Llama supermodels and Llama NIM microservices in production.

Combined with NVIDIA NIM inference microservices for Llama 3.1 405B, NeMo Retriever NIM microservices deliver the highest open and commercial text Q&A retrieval accuracy for RAG pipelines.

Enterprise Ecosystem Ready to Power Llama 3.1 and NeMo Retriever NIM Deployments

Hundreds of NVIDIA NIM partners providing enterprise, data and infrastructure platforms can now integrate the new microservices in their AI solutions to supercharge generative AI for the NVIDIA community of more than 5 million developers and 19,000 startups.

Production support for Llama 3.1 NIM and NeMo Retriever NIM microservices is available through NVIDIA AI Enterprise. Members of the NVIDIA Developer Program will soon be able to access NIM microservices for free for research, development and testing on their preferred infrastructure.

Credit: NVIDIA
