Google's new AI tool Whisk uses images as a guide

Google has yet another AI tool to add to the pile. Whisk is a Google Labs image generator that lets you use an existing image as your prompt. But its output only captures the "essence" of your starting image, rather than recreating it with new details. So it is better suited to brainstorming and quick visualizations than to edits of the source image.

The company describes Whisk as "a new kind of creative tool." The input screen starts with a bare-bones interface with inputs for style and subject. This simple introductory interface lets you choose from only three preset styles: sticker, enamel pin and plush. I suspect Google found that these three lend themselves to the rough-outline outputs the experimental tool is best suited to in its current form.

As you can see in the image above, Whisk produced a solid image of a Wilford Brimley plush. (Google's terms prohibit images of celebrities, but Wilford slipped through the gates, Quaker Oats in hand, without alerting the security guards.)

Whisk also includes a more advanced editor (found by clicking the Start From Scratch button on the main screen). In this mode, you can use text or a source image in three categories: subject, scene and style. There is also an input bar for adding more text as a finishing touch. However, in their current form, the advanced controls did not produce results that resembled my prompts.

For example, check out my attempt to create the late Mr. Brimley in a lightbox scene in the style of a walrus plush image I found online:

A screenshot of an AI generation tool that produces images of a man who looks a bit like Wilford Brimley. — Screenshot by Will Shanklin for Google / Engadget

Whisk produced what looks like a vaguely Wilford Brimley-like actor eating oatmeal inside a lightbox frame. As far as I can tell, that dude is not plush. So it's clear why Google recommends using the tool more for "quick visual exploration" and less for production-ready content.

Google acknowledges that Whisk will only use "a few key features" of your source image. "For example, the generated object may have a different height, weight, hairstyle, or skin color," the company warns.

To see why, check out Google's description of how Whisk works under the hood. It uses the Gemini language model to write a detailed caption of the source image you upload. It then feeds that description into the Imagen 3 image generator. So the result is an image based on Gemini's words about your image, not on the source image itself.
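
For the curious, here is a toy Python sketch of that caption-then-generate flow. It is not Google's code and calls no real Gemini or Imagen API: `SourceImage`, `caption_image`, `generate_image` and `whisk_like_pipeline` are all hypothetical stand-ins, and the "generator" only returns a string describing what it would draw. The point is simply that the second step sees text, never pixels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceImage:
    """Toy stand-in for an uploaded picture (hypothetical type)."""
    pixels: bytes        # raw image data; never reaches the generator
    key_features: tuple  # what a captioner might notice about the picture


def caption_image(image: SourceImage) -> str:
    # Hypothetical captioning step: only a few key features survive as text.
    return "a picture of " + ", ".join(image.key_features)


def generate_image(prompt: str) -> str:
    # Hypothetical generation step: it receives text only, so this toy version
    # returns a description of what it would draw instead of real pixels.
    return f"<new image drawn from prompt: {prompt!r}>"


def whisk_like_pipeline(image: SourceImage, style: str) -> str:
    # Caption first, then generate from the caption plus the chosen style.
    caption = caption_image(image)
    return generate_image(f"{caption}, in the style of {style}")


photo_a = SourceImage(pixels=b"\x00\x01", key_features=("older man", "white mustache"))
photo_b = SourceImage(pixels=b"\xff\xfe", key_features=("older man", "white mustache"))

# Different pixels, same caption, so the generator gets identical input.
print(whisk_like_pipeline(photo_a, "plush toy") == whisk_like_pipeline(photo_b, "plush toy"))
```

Two different photos that earn the same caption are indistinguishable to the generator in this sketch, which is one way to picture why details such as height or hairstyle can drift in the output.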

Whisk is available only in the US, at least for now. You can try it on the project's Google Labs site.
