Prompt API color sensitivity
I was playing with stress-testing the multimodal capabilities of the Prompt API and thought a nice test case might be to have the model read the current time painted on a <canvas>. As with my last Prompt API exploration, I'm again using a response constraint, the HH:mm:ss regular expression /^([0-1][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])$/. The prompt is "Read the time that you can see in this image and print it in HH:mm:ss format."
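For readers who want to reproduce the setup, here is a minimal sketch of how the prompt and the response constraint could be wired together. It assumes the Prompt API shape documented for Chrome at the time of writing (a session created via LanguageModel.create() with image input enabled, and a responseConstraint option that accepts a RegExp). The helper name readTimeFromCanvas is hypothetical and not taken from the demo's source.

```javascript
// The HH:mm:ss constraint and the prompt text quoted in the post.
const TIME_PATTERN = /^([0-1][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])$/;
const PROMPT =
  'Read the time that you can see in this image and print it in HH:mm:ss format.';

// Hypothetical helper. `session` is assumed to be a Prompt API session, e.g.:
//   const session = await LanguageModel.create({
//     expectedInputs: [{ type: 'image' }],
//   });
// `canvas` is the <canvas> element the time is painted on.
async function readTimeFromCanvas(session, canvas) {
  const response = await session.prompt(
    [
      {
        role: 'user',
        content: [
          { type: 'text', value: PROMPT },
          { type: 'image', value: canvas },
        ],
      },
    ],
    // Assumption: the constraint is passed as a RegExp.
    { responseConstraint: TIME_PATTERN },
  );
  // Defensive re-check: return null if the output is not a valid time string.
  return TIME_PATTERN.test(response) ? response : null;
}
```

Note that the constraint only guarantees a well-formed time string. It says nothing about whether the time is the one actually shown on the canvas.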
To my surprise, the model (Gemini Nano in Chrome) seems to be quite color-sensitive. I found that the model often gets the time wrong in dark mode when a red font is used to paint on the canvas. (The Canvas CSS system color is #121212 in Chrome in dark mode.) I checked the contrast between CSS #ff0000 (that is, red) and CSS #121212 (that is, black-ish) and it's 4.68:1, which for large text passes both WCAG AA (minimum 3:1) and WCAG AAA (minimum 4.5:1).
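The contrast figure can be checked by hand with the WCAG 2.x relative luminance formula. The sketch below is a plain implementation of that formula for six-digit hex colors. The function names are illustrative and not part of the demo.

```javascript
// Relative luminance of a "#rrggbb" color, per the WCAG 2.x definition.
function relativeLuminance(hex) {
  // Parse the three 8-bit channels and scale them to the 0..1 range.
  const channels = [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16) / 255);
  // Undo the sRGB transfer curve to get linear-light values.
  const [r, g, b] = channels.map((c) =>
    c <= 0.04045 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4,
  );
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// WCAG contrast ratio between two colors, always >= 1.
function contrastRatio(colorA, colorB) {
  const [lighter, darker] = [relativeLuminance(colorA), relativeLuminance(colorB)]
    .sort((x, y) => y - x);
  return (lighter + 0.05) / (darker + 0.05);
}

const ratio = contrastRatio('#ff0000', '#121212');
console.log(`${ratio.toFixed(3)}:1`);
```

The result lands close to the 4.68:1 quoted above. The last digit depends on whether a given tool rounds or truncates.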
This isn't really actionable, other than maybe as a heads-up to experiment with color preprocessing if the model's recognition performance is poorer than you expected.
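As one illustration of what such preprocessing could look like, here is a minimal sketch that collapses every pixel to pure black or pure white based on its brightest channel, so red text on a dark background becomes white text on black. It is an assumption about what might help, not something measured against the model, and the function name toHighContrast and the default threshold are made up for this example.

```javascript
// Hypothetical preprocessing step: binarize RGBA pixel data.
// `rgba` has the same layout as ImageData.data (4 bytes per pixel).
function toHighContrast(rgba, threshold = 128) {
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i < rgba.length; i += 4) {
    // Use the brightest channel so saturated colors like pure red count as "bright".
    const brightest = Math.max(rgba[i], rgba[i + 1], rgba[i + 2]);
    const value = brightest >= threshold ? 255 : 0;
    out[i] = out[i + 1] = out[i + 2] = value;
    out[i + 3] = rgba[i + 3]; // keep alpha unchanged
  }
  return out;
}

// In a browser this could be applied to a canvas before prompting:
//   const ctx = canvas.getContext('2d');
//   const image = ctx.getImageData(0, 0, canvas.width, canvas.height);
//   image.data.set(toHighContrast(image.data));
//   ctx.putImageData(image, 0, 0);
```

A fixed threshold works for the red-on-dark case described here, but other color combinations may need a different rule.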
Oh, and I almost forgot the results of my stress test: on my MacBook Pro 16-inch, Nov 2024 with an Apple M4 Pro and 48 GB of RAM, the model was able to keep up with about one complete (but not necessarily correct) prompt response per second. (Yes, I know that this machine is not what the average user has.)
You can play with the demo embedded below, or check out the source code on GitHub. Toggle between light mode and dark mode and choose red or CanvasText as the font color.
Update: It's a lot worse if the canvas background color is pure black #000000. I've updated the demo to use pure black, and have filed a Chromium bug.