TechStory

Nvidia’s New AI Model is Ready to Rival GPT-4, Setting New Standards in AI

by Reshab Agarwal
October 3, 2024
in AI, News

Nvidia has launched an open-source AI model family that competes directly with proprietary systems from major players such as OpenAI and Google. The company’s NVLM 1.0 family of multimodal large language models, led by the 72-billion-parameter NVLM-D-72B, is designed to rival GPT-4 on vision-language tasks while also improving text-only performance.

The NVLM 1.0 models target state-of-the-art results across several domains, especially vision-language tasks, and Nvidia claims they rival top-tier proprietary systems such as GPT-4. By publicly releasing the model weights and committing to provide the training code, Nvidia is breaking with an industry in which most advanced AI models remain closed to the public. The decision gives developers and researchers a rare opportunity to explore and build on a high-performance AI system.

NVLM-D-72B: Excelling in Visual and Textual Inputs

One of the standout features of NVLM-D-72B is its ability to handle both visual and textual inputs. The model can interpret images and memes and work through math problems step by step. It also becomes more accurate on text-only tasks after multimodal training, which is a challenge for many similar models: while other systems often see text performance drop after such training, NVLM-D-72B achieved a 4.3-point increase across text benchmarks.

Strong results in math, coding, and reasoning tasks underpin Nvidia’s claim that the model can rival GPT-4 in advanced multimodal capabilities.

The open-source release of NVLM 1.0 has sparked positive reactions from the AI community. One researcher noted that Nvidia’s NVLM-D-72B model performs similarly to other leading models, such as Llama 3.1 405B, in areas like math and coding, while also having strong capabilities in visual tasks.

Architectural Innovations and Industry Implications

NVLM 1.0 introduces a new architectural approach, blending various multimodal processing techniques. This hybrid method could influence future AI research and development. Nvidia’s decision to release such a model openly challenges the conventional business models of tech companies that keep their most advanced systems closed.

While this move opens doors for innovation, it also raises concerns about misuse and ethical implications. As powerful AI technology becomes more accessible, the need for responsible use and regulation grows.

Qualitative Capabilities of NVLM-D-72B

NVLM-D-72B shows its versatility across a range of multimodal tasks, including optical character recognition (OCR), reasoning, localization, and the application of world knowledge. For instance, the model can understand complex visual humor, such as memes, by performing OCR to identify the text and then reasoning about it to grasp the joke. In one example, NVLM-D-72B correctly interpreted the humor behind a meme comparing an “abstract” and a “paper” by analyzing both the visual cues and the text.

The model also excels in answering location-sensitive questions, solving mathematical problems step-by-step, and generating detailed descriptions of images. These capabilities position NVLM-D-72B as a powerful tool for both visual and textual reasoning tasks.

Key Technical Highlights

Nvidia’s NVLM 1.0 introduces several technical innovations that enhance its performance across multimodal tasks. A novel model architecture integrates elements from decoder-only multimodal LLMs like LLaVA and cross-attention-based models such as Flamingo. This hybrid design improves both training efficiency and multimodal reasoning capabilities. The introduction of a 1-D tile-tagging system for dynamic high-resolution images further boosts the model’s performance in OCR-related tasks.
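To make the 1-D tile-tagging idea more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Nvidia’s implementation: the 448-pixel tile size, the tag strings, the `<img>` placeholder token, and the function names `tile_grid` and `tag_tiles_1d` are all hypothetical. The sketch only shows the general idea of splitting a high-resolution image into tiles and prefixing each tile’s tokens with a text tag that carries its position in a flattened, one-dimensional order.

```python
from math import ceil

TILE_SIZE = 448  # assumed tile side in pixels (hypothetical value)


def tile_grid(width: int, height: int, tile: int = TILE_SIZE) -> tuple[int, int]:
    """Return (columns, rows) of the tile grid needed to cover the image."""
    return max(1, ceil(width / tile)), max(1, ceil(height / tile))


def tag_tiles_1d(num_tiles: int, tokens_per_tile: int = 4) -> list[str]:
    """Build a flattened token sequence with one text tag per tile.

    A downscaled global thumbnail comes first, followed by each tile in
    row-major order. The tag names are assumptions for illustration.
    """
    sequence = ["<tile_global_thumbnail>"] + ["<img>"] * tokens_per_tile
    for k in range(1, num_tiles + 1):
        sequence.append(f"<tile_{k}>")  # 1-D tag: position in the flattened order
        sequence.extend(["<img>"] * tokens_per_tile)
    return sequence


# Usage: an 896x448 image needs a 2x1 grid, so two tiles plus the thumbnail.
cols, rows = tile_grid(896, 448)
tokens = tag_tiles_1d(cols * rows, tokens_per_tile=2)
print(cols, rows)
print(tokens)
```

The tags give the language model an explicit textual cue about where each tile’s tokens begin, which plausibly helps on tasks such as OCR where the layout of the text matters.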

Additionally, the training data for NVLM 1.0 was carefully curated, with a focus on dataset quality and task diversity rather than sheer scale. This strategy proved effective in enhancing the model’s math and reasoning capabilities. One of NVLM 1.0’s most notable features is its production-grade multimodality: it excels in vision-language tasks without compromising its text-only performance.
