News center > News > Headlines > Context
Are big models color blind?
Editor
2025-01-17 15:01 2,263

Are big models color blind?

Image source: Generated by Unbounded AI

Let me start with the conclusion:

Most models are color blind

Most of the information for people with color blindness comes from visual input.

We use our eyes to see the rising sun, the bright moon, the lonely smoke in the desert, and the magnificent blue sea. So, when we take pictures of the beautiful scenery and discuss it with the big model: Does the big model see the same thing as us?

Perhaps what the big model sees is different from what we see.

So there is this test: Is the large model color blind?

During the physical examination, the doctor may show you a few pictures and ask you what the numbers are, like the one below

This is the Ishihara color blindness test chart, which is composed of various Colored dots form multiple numbers: people with normal color vision can distinguish them correctly, but people with color blindness will make mistakes.

Then, when we give these test images to AI, let him take a look. Here are two of the most classic ones: one is for people with color blindness who cannot see the numbers (red and green blindness reads them incorrectly), and the other is for those who are only color blind and can see the numbers.

Test A

Normal reading: 74

Red-green color blindness: 21

Test B

Normal reading : No number

Red-green color blindness: 5

Tested party, selected 4 companies:

OpenAI's GPT-4oClaude (Anthropic) 3.5 Sonnet, passed The GLM-4

Prompt of ClaudeGemini (Google) 2.0 (exp-1206) is used uniformly: Are there numbers in the picture? If so, what is it?

Question 1

Normal reading: 74; red-green color blindness: 21

ChatGPT’s GPT-4o, the answer is correct

Claude’s 3.5 Sonnet, some color blindness< /p>

Gemini’s 2.0 (exp-1206), real red-green color blindness

Wisdom spectrum GLM-4, the answer is correct

Summary: OpenAI and Wisdom Spectrum models, in this test, color vision is normal. Gemini is red-green blind, and Claude doesn’t know what kind of color blindness it is

Second question

Normal reading: no number; red-green blindness: 5

ChatGPT’s GPT-4o answered one 5, identified as semi-color blind

Claude’s 3.5 Sonnet, answered a 5, identified as semi-color blind

Gemini’s 2.0 (exp-1206), nothing

GLM-4 of Zhipu, the answer is correct

Summary: In this test, only GLM-4 answered correctly.

Draw a conclusion

Let’s talk firstConclusion: Based on the color blindness sample test above, Intelligent Spectrum is better than most models in visual understanding.

OpenAI

Claude

Gemini

Wisdom spectrum test A✅

❌❌✅ test B

❌❌❌✅

No wonder it got the White House panic certification: "Wisdom Spectrum: Statement on being included in the Entity List by the U.S. Department of Commerce"

Then On the day it entered the Entity List, Zhipu built a realtime API that matched GPT-4o, empowered the hardware mouth and eyes, and was an end-to-end model with a two-minute memory capacity and the ability to sing. It should be the current domestic model. The strongest.

The understanding model GLM-4V-Plus has also been fully upgraded (the GLM-4 on the webpage is also based on this when reading pictures), supports the variable resolution function, and saves tokens! (For example, At the resolution of 224 * 224, the number of input image tokens is only 3% of the original), and it also supports lossless recognition of 4K ultra-clear images and extreme aspect ratio images.

Also, its video understanding model has been updated to support 2 hours of content: "New models of Zhipu Realtime, 4V, and Air are released and launched on bigmodel.cn"

Of course, from From a developer's perspective, the most bragging point is that the following 4 models are all free:

Language model GLM-4-Flash image understanding model GLM-4V-Flash image generation model CogView-3-Flash video generation model Cog Video Only in this way can we serve us better.

And... I also tested several other companies in China, but the results were not ideal. If you want to know the conclusion, you can test it yourself with the pictures in the article, and then post it in the comment area.

Keywords: Bitcoin
Share to: