Howdy readers! Hope things are wonderful at your end. We are putting out today’s newsletter in a bit of a hurry, yet there is a lot to cover from Google’s amazing I/O. So here are the main takeaways from Google I/O 2024.
Google I/O 2024 Updates
* Gemini 1.5 Pro
Google released the multimodal Gemini 1.5 Pro with a 1 million token context window (2 million coming soon). The bigger context window benefits developers and academic researchers who have a huge pile of PDFs, audio files, and videos to analyze.
* Project Astra powered by Gemini (the real showstopper)
Our favorite tool, Project Astra, is dubbed “a universal agent helpful in everyday life.” During a demonstration, the research model showcased its capabilities by identifying sound-producing objects, providing creative alliterations, explaining code on a monitor, and locating misplaced items. The AI assistant also exhibited its potential in wearable devices, such as smart glasses, where it could analyze diagrams, suggest improvements, and generate witty responses to visual prompts.
Google says that Astra uses the camera and microphone on a user’s device to provide assistance in everyday life. By continuously processing and encoding video frames and speech input, Astra creates a timeline of events and caches the information for quick recall. The company says that this enables the AI to identify objects, answer questions, and remember things it has seen that are no longer in the camera’s frame.
* Veo – text-to-video generation like Sora. Yet another remarkable achievement from Google.
Google also showcased its text-to-video generation tool, Veo. The videos shown were remarkable and looked on par with Sora. Veo also seems to offer many editing options that give users fine control over the result. It is not yet publicly available.
* Google Search – powered by Gemini
Multi-step searching for more systematic, relevant, and accurate results. Google Search with advanced AI will handle searching, researching, brainstorming, planning, and much more.
* Gemini for Workspace
Gmail gets advanced research, suggestions, organizing, graphical visualization, and automation. The virtual AI teammate is an AI with a customized job description in Google Chat, capable of analyzing, summarizing, and taking other actions.
* Android AI – AI at the core. Android will be the first OS to have AI at its core.
Gemini as the AI assistant on Android, Circle to Search, and AI-powered Google Search by default.
The Gemini app on Android will have advanced features such as context awareness and the ability to interact with what is on the user’s screen, including watching and analyzing YouTube videos and reading and analyzing a PDF file that the user is currently viewing. Google Pixel will have Gemini Nano with multimodal features.
Android will detect suspected scam calls in real time and alert you instantly so you can hang up the call.
* Gemini 1.5 Pro & Gemini 1.5 Flash for developers, available globally.
New features like video frame extraction, parallel function calling, context caching, etc.
- Gemini 1.5 Pro with the 1 million token context window will be priced at $7 per 1 million tokens (multimodal).
- For a 128k context window, Gemini 1.5 Pro will be priced at $3.50 per 1 million tokens (multimodal).
- Gemini 1.5 Flash with a 128k context window will start at $0.35 per 1 million tokens.
The Pro version is best for complex tasks but is a bit slower, while the Flash version is best for simpler tasks and responds much faster.
- Google’s open-source Gemma 2 is coming in June at 27B parameters. Gemma 1, in 7B and 2B variants, is already available as an open-source model and can be downloaded from platforms like Hugging Face.
- PaliGemma – a new open-source vision-language model, released on platforms like Hugging Face.
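For readers who want to turn the Gemini per-token prices quoted above into a rough budget, here is a minimal Python sketch. The tier names are labels made up for this newsletter, not official model IDs, and the sketch assumes one flat rate for every token in a request, which is a simplification of how real billing may work.

```python
# Prices in USD per 1 million tokens, as quoted in this newsletter.
# The tier names are hypothetical labels, not official model identifiers.
PRICE_PER_MILLION_TOKENS = {
    "pro-1m-context": 7.00,
    "pro-128k-context": 3.50,
    "flash-128k-context": 0.35,
}


def estimate_cost(tier: str, tokens: int) -> float:
    """Return the estimated cost in USD for a given number of tokens.

    Assumes a single flat rate per tier (a simplification).
    """
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return PRICE_PER_MILLION_TOKENS[tier] * tokens / 1_000_000


if __name__ == "__main__":
    # Compare the three tiers for the same 100,000-token workload.
    for tier in PRICE_PER_MILLION_TOKENS:
        cost = estimate_cost(tier, 100_000)
        print(f"{tier}: ${cost:.3f} for 100,000 tokens")
```

At these rates, a 100,000-token job comes to about $0.035 on Flash versus $0.35 on Pro with the 128k window, which is the tenfold gap behind the "Flash for simple tasks" advice.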
Other Updates
* Gemini AI Audio Overviews: review text, podcasts, and files in audio form.
* Agents are under development that can perform and automate daily tasks, such as returning an order for you.
* A more advanced “Imagen 3” for image generation with highly realistic results.
* Music generation tool.
* Trillium – 6th generation TPU – 4.7x improvement in compute power.
* Multimodal Gemini app – it will be equipped with Project Astra later this year.
* Personalized Gemini with a new feature called Gems for advanced reasoning, planning, and researching, like planning a vacation.
* Developing & using AI responsibly.
Watermarking the output of AI models with SynthID to identify AI-generated content and help prevent misinformation such as deepfakes.
In short, Google showed that it intends to be the undisputed AI king. With its decade-long research in AI and its established ecosystem, adoption of Google AI should be no issue.
Romo AI Updates
In another AI update, our favorite Romo AI has released a new tool where a user can enter a YouTube video URL and it will generate a blog post, a pros and cons list, or a summary of the main ideas in the blink of an eye. Visit www.RomoAI.com and sign up for a free account. You can also use coupon code “get22” to get an exclusive 22% discount on any of Romo AI’s paid plans.
Signing off
Mike Rowan
AI Nuggets