Thursday, June 11, 2026
TechStory

Nvidia’s New AI Model is Ready to Rival GPT-4, Setting New Standards in AI

by Reshab Agarwal
October 3, 2024
in AI, News

Nvidia says its new AI model rivals GPT-4 on both vision-language tasks and text-only performance. The company has released NVLM 1.0, an open-source family of multimodal large language models that competes directly with proprietary systems from major players such as OpenAI and Google. The family’s flagship, NVLM-D-72B, is designed to excel at vision-language tasks while also improving text-only performance.

The NVLM 1.0 models target state-of-the-art results across several domains, especially vision-language tasks, and Nvidia claims they rival top-tier proprietary systems such as GPT-4. The company has publicly released the model weights and committed to providing the training code, a notable departure in an industry where most advanced AI models remain closed to the public. The release gives developers and researchers direct access to a high-performance multimodal system that they can study and build on.

NVLM-D-72B: Excelling in Visual and Textual Inputs

A standout feature of NVLM-D-72B is its ability to handle both visual and textual inputs. It can interpret images and memes and work through math problems step by step, which sets it apart from many other AI systems. It also becomes more accurate on text-based tasks after multimodal training, something many comparable models struggle with: while other systems often lose text performance after such training, NVLM-D-72B achieved a 4.3-point increase across text benchmarks.

The model’s strength in math, coding, and reasoning tasks is central to Nvidia’s claim that it can rival GPT-4 in advanced multimodal capabilities.

The open-source release of NVLM 1.0 has sparked positive reactions from the AI community. One researcher noted that Nvidia’s NVLM-D-72B model performs similarly to other leading models, such as Llama 3.1 405B, in areas like math and coding, while also having strong capabilities in visual tasks.

Architectural Innovations and Industry Implications

NVLM 1.0 introduces a new architectural approach that blends decoder-only and cross-attention-based techniques for multimodal processing. This hybrid method could influence future AI research and development. Nvidia’s decision to release such a model openly also challenges the conventional business model of tech companies that keep their most advanced systems closed.

While this move opens doors for innovation, it also raises concerns about misuse and ethical implications. As powerful AI technology becomes more accessible, the need for responsible use and regulation grows.

Qualitative Capabilities of NVLM-D-72B

NVLM-D-72B shows its versatility across a range of multimodal tasks, including optical character recognition (OCR), reasoning, localization, and the application of world knowledge. For instance, the model can understand visual humor such as memes by using OCR to read the text and reasoning to grasp the joke. In one example, NVLM-D-72B correctly explained the humor in a meme comparing an “abstract” and a “paper” by analyzing both its visual cues and its text.

The model also excels in answering location-sensitive questions, solving mathematical problems step-by-step, and generating detailed descriptions of images. These capabilities position NVLM-D-72B as a powerful tool for both visual and textual reasoning tasks.

Key Technical Highlights

Nvidia’s NVLM 1.0 introduces several technical innovations that enhance its performance across multimodal tasks. A novel model architecture integrates elements from decoder-only multimodal LLMs like LLaVA and cross-attention-based models such as Flamingo. This hybrid design improves both training efficiency and multimodal reasoning capabilities. The introduction of a 1-D tile-tagging system for dynamic high-resolution images further boosts the model’s performance in OCR-related tasks.
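
To make the tile-tagging idea more concrete, here is a minimal Python sketch of how a high-resolution image might be cut into tiles and each tile labeled with a one-dimensional text tag. It is an illustration only, not Nvidia’s implementation: the function names, the `<tile_k>`, `<tile_global>`, and `<img>` strings, the placement of a global thumbnail first, and the 448-pixel tile size used in the note below are all assumptions made for this example.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int) -> List[Box]:
    """Split an image into tile-sized boxes in row-major (raster) order."""
    boxes: List[Box] = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Edge tiles are clipped to the image bounds.
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


def tag_sequence(boxes: List[Box], tokens_per_tile: int = 2) -> List[str]:
    """Build a flat token list in which a text tag precedes each tile's tokens.

    The tag carries a single running index (1-D), not a (row, column) pair.
    "<img>" is a hypothetical placeholder for one visual token.
    """
    # Assumed for illustration: a downscaled view of the whole image comes first.
    seq = ["<tile_global>"] + ["<img>"] * tokens_per_tile
    for k, _box in enumerate(boxes, start=1):
        seq.append(f"<tile_{k}>")
        seq.extend(["<img>"] * tokens_per_tile)
    return seq
```

Under these assumptions, a 1344×896 image with 448-pixel tiles yields six boxes, and `tag_sequence` labels them `<tile_1>` through `<tile_6>` in reading order. The tags give the language model an explicit textual cue about which tile each group of visual tokens came from.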

The training data for NVLM 1.0 was also carefully curated, with a focus on dataset quality and task diversity rather than sheer scale. This strategy proved effective in strengthening the model’s math and reasoning capabilities. One of NVLM 1.0’s most notable features is its production-grade multimodality: it excels at vision-language tasks without compromising its text-only performance.

Reshab Agarwal

Reshab is a tech-enthusiast who likes to write about all things crypto. He is a Bitcoin bull and believes in a decentralized future of finance. Follow him on Twitter for more!
