Nvidia’s Open Models to Provide Free Training Data for LLMs

Nvidia is expanding its substantial influence in the AI sphere with Nemotron-4 340B, a family of open models designed to generate synthetic LLM training data for commercial applications across numerous fields. Synthetic data is artificially generated data designed to mimic the characteristics and structure of data found in the real world. Through what Nvidia is calling a “uniquely permissive” free open model license, Nemotron-4 340B provides a scalable way for developers to build LLMs. The offering is being called “groundbreaking” and an important step toward the democratization of artificial intelligence.
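Nvidia’s approach pairs an instruct model, which drafts candidate responses, with a reward model, which scores them so only high-quality pairs are kept for training. The sketch below illustrates that filtering loop in miniature; `generate_response` and `score_response` are hypothetical stand-ins, not real Nemotron APIs, and the scoring heuristic is a placeholder.

```python
def generate_response(prompt: str, n: int = 4) -> list[str]:
    """Stand-in for an instruct model: draft n candidate responses."""
    return [f"response {i} to: {prompt}" for i in range(n)]

def score_response(prompt: str, response: str) -> float:
    """Stand-in for a reward model: score response quality."""
    return float(len(response) % 10)  # placeholder heuristic, not a real scorer

def build_synthetic_pairs(prompts: list[str], threshold: float = 3.0) -> list[dict]:
    """Keep only (prompt, response) pairs whose reward clears the threshold."""
    dataset = []
    for prompt in prompts:
        candidates = generate_response(prompt)
        best = max(candidates, key=lambda r: score_response(prompt, r))
        if score_response(prompt, best) >= threshold:
            dataset.append({"prompt": prompt, "response": best})
    return dataset

pairs = build_synthetic_pairs(["Explain tokenization.", "What is a GPU?"])
```

The filtering step is the key idea: generation is cheap, so the reward model’s judgment, rather than the raw output volume, determines what enters the training set.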

“With a significant 9 trillion tokens used in training, a 4,000-token context window, and support for over 50 natural languages and 40 programming languages, Nemotron-4 340B outshines its competitors, including Mistral’s Mixtral-8x22B, Anthropic’s Claude-Sonnet, Meta’s Llama 3-70B, [Alibaba’s] Qwen2, and even rivals the performance of [OpenAI’s] GPT-4,” writes VentureBeat, calling it “a significant milestone” whose potential impact “cannot be overstated.”

Of particular interest are the commercially friendly license terms. “Nvidia’s commitment to making Nemotron-4 340B accessible to businesses is evident,” VentureBeat reports, adding that “this move is set to democratize AI, allowing companies of all sizes to harness the power of LLMs and create custom models tailored to their specific needs.”

Nemotron-4 340B can be downloaded now from Hugging Face, and developers will soon be able to access the models at ai.nvidia.com packaged as an Nvidia NIM microservice, Nvidia explains in a blog post. More details are available in a research paper.

Nvidia has also made the HelpSteer2 dataset freely available for alignment support, which VentureBeat notes has “propelled the Nemotron-4 340B Reward model to the top of the RewardBench leaderboard on Hugging Face” and further underscores the company’s “dedication to advancing the AI community as a whole.”

Nvidia says it has optimized the Nemotron-4 340B models for operation with its open-source NeMo and TensorRT-LLM tools, “facilitating efficient model training and deployment,” writes PYMNTS, explaining that “NeMo is a toolkit for building and training neural networks, while TensorRT-LLM is a runtime for optimizing and deploying LLMs.”

PYMNTS contextualizes the Nemotron-4 340B model breakthrough, citing industry analysts warning that “the demand for high-quality data, essential for powering artificial intelligence conversational tools like OpenAI’s ChatGPT, may soon outstrip supply and potentially stall AI progress.”

“Humanity can’t replenish that stock faster than LLM companies drain it,” said Jignesh Patel, computer science professor at Carnegie Mellon University.
