At the keynote of their I/O developer conference this morning, Google outlined their work in AI and where that work will show up in Search, Workspace and even our phones.

Gemini AI

Google’s AI-powered future is of course built on their Gemini AI model, which began rolling out last year. Since launch, Google has been updating Gemini across all three tiers (Ultra, Pro and Nano), with the model now at version 1.5. Its features include multi-modality, allowing you to feed in a variety of sources (video, documents, audio, images and more) for the model to assess.

The more data, the more context the AI model can work with, and Google announced this morning that the current models will expand to accept 2 million tokens’ worth of data (up from 1 million) later this year.
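To put those numbers in perspective, a common rule of thumb is that one token is roughly four characters of English text (that ratio is an assumption for illustration, not Gemini’s actual tokenizer). A quick back-of-the-envelope sketch:

```python
# Rough back-of-the-envelope context window estimate.
# Assumptions (illustrative only, not Google's tokenizer):
#   ~4 characters per token, ~5 characters per word, ~500 words per page.
CHARS_PER_TOKEN = 4
CHARS_PER_WORD = 5
WORDS_PER_PAGE = 500

def approx_pages(token_limit: int) -> int:
    """Estimate how many pages of plain text fit in a context window."""
    chars = token_limit * CHARS_PER_TOKEN
    words = chars / CHARS_PER_WORD
    return round(words / WORDS_PER_PAGE)

print(approx_pages(1_000_000))  # the current 1M-token window -> 1600
print(approx_pages(2_000_000))  # the announced 2M-token window -> 3200
```

On those (very rough) assumptions, a 2-million-token window is in the ballpark of 3,000+ pages of plain text in a single query.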

Google AI Features

Photos

One of our favourite Google products is Photos, and Gemini will soon be powering your queries in Google Photos with a new feature called ‘Ask Photos’.

Ask Photos will let you search your photo library for specific memories, or for information captured somewhere in your gallery.

Gemini understands the context of a question such as ‘What is my licence plate number?’, digging through your photos to deliver the info.

Searches powered by Gemini will also extend to information that needs context from across your gallery, such as asking when your child learnt to swim. In their on-stage demo, Gemini was able to gather photos into a gallery, including shots of the child swimming in pools and oceans, with a copy of her swimming certificate included.

Search is of course the foundation of Google, and it’s set to see some very big Gemini-powered changes moving forward to help improve your searches.

AI Overviews

AI Overviews are generated based on your searches, offering a quick summary of the answer to your query along with links to research the topic further.

Google says you’ll soon be able to adjust these AI Overviews to suit you, with options to simplify the language, making results easier to share with children or to follow if you’re new to a topic.

Searches can also take advantage of Gemini’s long-context abilities, with complex questions rolled into a single large query. One example from the demo: ‘Find the best yoga or pilates studios in Boston and show me details on their intro offers, and walking time from Beacon Hill’.

The AI Overview result pulled all of this together from multiple sources, including Google Maps reviews, showing a map of possible matches and a summary of intro offers.

Google says they’ll be rolling AI Overviews out to everyone in the US today, with more countries coming soon.

Plan Ahead

Planning ahead is soon going to be a big thing in Search, with Google announcing support for planning trips and meals. 

Planning capabilities will be built into Search, with queries such as ‘Plan a 3 day meal plan for a group that’s easy to prepare’. You’ll receive a suggested plan, and can then adjust the query to account for things like dietary restrictions and more.

Once you’re happy, you can export the meal plan to Gmail or Docs for easy sharing, and Google suggested that one day it may be as easy as sending it straight to your grocery provider and having it all just work.

Google says they’re starting with vacations and meal planning in Search Labs in English in the US, but will look to expand this to parties, date nights and workouts – though they haven’t said when it will appear outside the US.

Video AI interpreter

Sometimes a text or voice search won’t do, so Google is introducing the option to use video as a search query.

In their demo, Googler Rose Yao showed a broken Audio-Technica LP player, using a video of the fault as the search query. The Gemini-powered search was able to assess what was happening and offer an AI Overview result with links for further research.

It looks like this feature is staying US-only for now, available in English in Search Labs, but Google says they’ll look to expand it to more regions ‘over time’.

Android

The Google I/O Keynote was primarily focused on their AI efforts, and Android is a key place where we’ll see a lot of these advancements on mobile. In a brief entry at the end of the Keynote, President of Android Ecosystem Sameer Samat described Android as ‘the best place to experience Google AI’.

Samat said that Android is being reimagined with AI at the core, brought together in three ways this year:

  1. Putting AI-powered search right at your fingertips.
  2. Making Gemini your new AI assistant on Android.
  3. Harnessing on-device AI to unlock new experiences that work as fast as you do, while keeping your sensitive data private.

Circle to Search for Students

The new Circle to Search for Students will help them (and you) with physics and math word problems, with the results offering step-by-step instructions on how to solve the problem, giving students a deeper understanding of the answer.

Google says they’ll be adding in more capabilities for this later this year, including being able to solve ‘problems involving symbolic formulas, diagrams, graphs and more’.

Gemini on Android

Google is also improving the Gemini Nano AI model. Features will include more contextual options, based on what’s on your screen, and a new multi-modal update coming to Pixel later this year. 

Improvements to context will see Android users able to use Gemini over the top of apps in more ways. Soon you’ll be able to generate, then drag and drop, images straight into Gmail, Google Messages and other apps, or tap ‘Ask this Video’ in YouTube to find specific information.

Google is also baking features into Gemini Advanced, their paid Gemini tier which costs $32.99 AUD/month (after a 2-month free trial), adding the option to ‘Ask this PDF’ to find answers within PDF documents.

Multimodal support on Android will roll out to Pixel users as part of a Gemini Nano update later this year. This means your phone will be able to process not only text input, but also additional information including sights, sounds and spoken language.

Multimodal capabilities will help people who experience blindness or low vision, with TalkBack, Google’s image description feature, using AI to get richer and clearer descriptions of what’s in an image – and because it’s on-device, it happens fast and without any network connection.

Alerts for Scam Calls!

With the prevalence of scam calls rising, Google will be using the on-device Gemini Nano AI model to scan your calls, and provide real-time alerts if it detects a possible scam.

Detection happens entirely on-device, easing privacy concerns, with Gemini able to spot suspicious activity on the fly. During the on-stage demo, a call was received, and after a ‘bank representative’ offered to move funds to another account, the scam detection triggered and showed an alert on the display.
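Google hasn’t detailed how the detection works under the hood, but conceptually it’s a classifier running over the live call transcript. A toy sketch of the idea (purely illustrative: a keyword match rather than an AI model, and the phrase list is invented):

```python
# Toy illustration of on-device scam-call flagging.
# This is NOT Google's model: it simply scans a rolling call transcript
# for phrases commonly associated with scam scripts (list is invented).
SUSPICIOUS_PHRASES = [
    "move your funds",
    "transfer to a safe account",
    "gift card",
    "verify your pin",
]

def flag_scam(transcript: str) -> list[str]:
    """Return any suspicious phrases found in the transcript so far."""
    text = transcript.lower()
    return [p for p in SUSPICIOUS_PHRASES if p in text]

hits = flag_scam(
    "Hi, this is your bank representative. "
    "To protect you, please transfer to a safe account today."
)
if hits:
    print("Possible scam detected:", hits)
```

The real feature would use Gemini Nano’s language understanding rather than fixed phrases, which is what lets it catch novel scam scripts while still running locally.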

Also on the privacy front, Google says this feature will launch as ‘opt-in’ when it arrives later this year, likely in Android 15.

Android 15

We’re all keen to see more of Android 15 and its upcoming features, and as a final sign-off, Dave Burke announced that the Android 15 Beta 2 release is coming tomorrow.

Google Workspace

Google has also announced more Gemini features for Google Workspace. On the desktop you’ll see Gemini in the Workspace side panel, while mobile users will find Gemini features coming to the Gmail app for Android. Google is also expanding their ‘Help me write’ feature to more languages, with support for Spanish and Portuguese coming to the desktop.

In Workspace, Google is bringing in Gemini 1.5 Pro, the latest version of the AI model. Gemini 1.5 Pro brings a longer context window and more advanced reasoning, allowing it to handle larger datasets and provide clearer, more insightful responses. Google gives the example of scanning all the emails from your child’s school (there’s always loads of those) and getting a summary.

For Gmail mobile users, Gemini will soon add three new features: the ability to summarise emails, more advanced replies with contextual awareness drawn from email threads, and Gmail Q&A, which will let you query summaries of your emails.

Generative AI

There’s an absolute plethora of options for generative AI these days, and Google has announced updates to their tools, including announcing a new video generation tool, as well as updates to their image generation and music creation tools. 

For video generation, Google announced VideoFX to sit alongside ImageFX (image generation) and MusicFX (surprise: music generation).

Powered by Veo, their latest video generation model, VideoFX can generate 1080p resolution videos using a variety of cinematic styles and can offer shots including timelapses and aerial shots of landscapes. 

You’ll be able to work from Storyboards which will let you iterate scene by scene, as well as add music to your final video.

Unfortunately, it looks like access to VideoFX is limited, with Google selecting US creators from a waitlist for private previews.

In ImageFX you’ll soon see more editing controls, as well as images created by Imagen 3, their latest text-to-image model. 

From today, ImageFX users will see new editing controls, allowing you to add, remove or change specific elements in your images by simply brushing over them.

Imagen 3 will be able to generate images with ‘an incredible level of detail, producing photorealistic, lifelike images, with far fewer distracting visual artifacts than our prior models’ says Google.

Google says they’ve also improved how prompts are interpreted, so Imagen 3 better understands natural language for improved results, and it can render text more accurately – something that’s long been an issue for AI-generated images.

Access to Imagen 3 is limited for now, with a select few creators able to check it out – though you can try to join the waitlist – or wait for Imagen 3 to come to Vertex AI, Google’s AI platform.

For musicians, Google has added a new DJ Mode to MusicFX, as well as generative music tools built on Lyria.

In MusicFX, the new tools will let you ‘unleash your inner DJ and craft new beats’ with AI-powered music creation. From the video demo it looks pretty neat, with a few styles able to be overlaid over a sample.

Using Lyria, Google has been working with artists including Wyclef Jean, Grammy-nominated songwriter Justin Tranter and electronic musician Marc Rebillet to create a suite of music AI tools called Music AI Sandbox.

Google says that Music AI Sandbox will allow people to ‘create new instrumental sections from scratch, transform sound in new ways and much more’. It’s pretty neat, and they’ve all released tracks made with Music AI Sandbox on their respective YouTube channels.