Google released Gemini 2.0 Flash Thinking to compete with OpenAI o1

In its latest push to redefine the AI landscape, Google announced Gemini 2.0 Flash Thinking, a multimodal reasoning model capable of solving complex problems with speed and clarity.

In a post on the social network X, Google CEO Sundar Pichai wrote: "Our most thoughtful model ever :)"

And in its developer documentation, Google explains: "Thinking Mode is capable of stronger reasoning in its responses than the base Gemini 2.0 Flash model," which was Google's latest and greatest, released only eight days earlier.

The new model supports only 32,000 input tokens (roughly 50-60 pages of text) and can produce up to 8,000 tokens per output response. In Google AI Studio's sidebar, the company claims it's best for "multimodal understanding, reasoning" and "coding."
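For developers wondering whether a prompt fits under that input limit, a rough pre-check might look like the Python sketch below. The 32,000 figure is the limit reported above; the four-characters-per-token ratio is only a common rule of thumb assumed here, not Gemini's actual tokenizer, and the function names are hypothetical.

```python
INPUT_TOKEN_LIMIT = 32_000  # input limit reported for the model
CHARS_PER_TOKEN = 4         # assumption: rough heuristic, not Gemini's tokenizer


def estimate_tokens(text: str) -> int:
    """Roughly estimate a token count from the character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_input_budget(text: str, limit: int = INPUT_TOKEN_LIMIT) -> bool:
    """Return True if the estimated token count is within the input limit."""
    return estimate_tokens(text) <= limit


def tokens_per_page(pages: int, limit: int = INPUT_TOKEN_LIMIT) -> float:
    """Tokens per page implied by spreading the limit over a page count."""
    return limit / pages


if __name__ == "__main__":
    prompt = "Explain step by step which is larger, 9.9 or 9.11."
    print(estimate_tokens(prompt), fits_input_budget(prompt))
    print(tokens_per_page(50), tokens_per_page(60))
```

By this arithmetic, 32,000 tokens spread over 50 to 60 pages works out to roughly 533 to 640 tokens per page. An exact count would require the model's own tokenizer.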

Details about the model's training process, architecture, licensing and costs have not yet been released. It currently shows zero cost per token in Google AI Studio.

Accessible and transparent reasoning

Unlike the competing reasoning models o1 and o1-mini from OpenAI, Gemini 2.0 allows users to access its step-by-step reasoning through a drop-down menu, offering a clearer, more transparent view of how the model arrives at its conclusions.

By allowing users to see how decisions are made, Gemini 2.0 addresses long-standing concerns about AI acting as a "black box" and brings the model, whose licensing terms are still unclear, to parity with other open-source models released by competitors.

In my first simple tests, the model answered accurately and quickly (within one to three seconds) some questions that have been difficult for other AI models, such as counting the number of Rs in the word "Strawberry." (See screenshot above.)

In another test, comparing two decimal numbers (9.9 and 9.11), the model systematically broke down the problem into smaller steps, from analyzing whole numbers to comparing decimal places.
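The kind of step-by-step breakdown described here can be sketched in a few lines of Python. This is only an illustration of the comparison procedure, not the model's actual reasoning trace; the function name is hypothetical, and it assumes non-negative, well-formed decimal strings with a whole-number part.

```python
def compare_decimals(a: str, b: str) -> int:
    """Return 1 if a > b, -1 if a < b, and 0 if they are equal."""
    a_whole, _, a_frac = a.partition(".")
    b_whole, _, b_frac = b.partition(".")

    # Step 1: compare the whole-number parts.
    if int(a_whole) != int(b_whole):
        return 1 if int(a_whole) > int(b_whole) else -1

    # Step 2: pad the fractional parts to equal length (9.9 becomes 9.90).
    width = max(len(a_frac), len(b_frac))
    a_frac = a_frac.ljust(width, "0")
    b_frac = b_frac.ljust(width, "0")

    # Step 3: compare decimal places from left to right.
    for digit_a, digit_b in zip(a_frac, b_frac):
        if digit_a != digit_b:
            return 1 if digit_a > digit_b else -1
    return 0


if __name__ == "__main__":
    print(compare_decimals("9.9", "9.11"))  # prints 1, meaning 9.9 is larger
```

The padding step is what trips up a naive reading: once 9.9 is written as 9.90, the tenths digit (9 versus 1) settles the comparison before the hundredths place is ever considered.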

These results are backed up by independent third-party analysis from LM Arena, which named Gemini 2.0 Flash Thinking the number one model across all LLM categories.

Native support for uploading and analyzing images

Unlike the competing OpenAI o1 family, Gemini 2.0 Flash Thinking is designed to process images from the start.

o1 launched as a text-only model, but has since expanded to include image and file upload analysis. Both models can only return text at this time.

Gemini 2.0 Flash Thinking also does not currently support integration with Google Search, other Google apps or external third-party tools, according to the developer documentation.

The multimodal capabilities of Gemini 2.0 Flash Thinking expand its potential use cases and allow it to handle scenarios that combine different types of data.

For example, in one test, the model solved a puzzle that required analyzing textual and visual elements, demonstrating its versatility in integrating and reasoning across formats.

Developers can use these features through Google AI Studio and Vertex AI, where the model is available for experimentation.

As the AI landscape becomes increasingly competitive, Gemini 2.0 Flash Thinking may mark the beginning of a new era for problem-solving models. Its ability to handle a variety of data types, offer visible reasoning, and perform at scale makes it a serious contender in the AI market, competing with OpenAI's o1 family and beyond.