Google’s latest Gemini update brings AI music, on-device models, and — finally — citations that aren’t fictional.

Google dropped a Gemini feature update this week that, depending on your vantage point, is either a quiet flex or a sign that the everything-engine approach to AI is accelerating faster than most people realise. Three new capabilities. Three very different bets. Let’s break them down.
1. AI Music Generation: DALL·E for Your Ears
Gemini can now generate custom music from text prompts. Describe a mood, a genre, a tempo — and it composes a track. It’s not going to replace Hans Zimmer any time soon (his lawyers can stand down), but it doesn’t need to. The real audience here isn’t professional composers. It’s content creators, indie game developers, and anyone who’s ever spent an unreasonable amount of time scrolling royalty-free music libraries trying to find something that doesn’t sound like hold music at a regional insurance company.
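The "mood, genre, tempo" framing can be made concrete with a tiny sketch. Everything here is illustrative: the function name, parameters, and prompt phrasing are my assumptions, not Gemini's actual music API, which Google hasn't documented in this update.

```python
def music_prompt(mood: str, genre: str, tempo_bpm: int, duration_s: int = 30) -> str:
    """Build a text-to-music prompt from the knobs described above.

    Purely illustrative: the field names and phrasing are assumptions,
    not Gemini's real interface.
    """
    # Guard against tempos no generator is likely to honour.
    if not 40 <= tempo_bpm <= 240:
        raise ValueError("tempo_bpm outside a plausible musical range")
    return (f"A {duration_s}-second {genre} track, {mood} in mood, "
            f"around {tempo_bpm} BPM, suitable for background use.")
```

The value of structuring the prompt this way is that the same three or four knobs can drive a whole library of background tracks without hand-writing each description.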
The strategic angle is more interesting than the feature itself. Google is pushing hard into multimodal generation — text, image, video, code, and now audio — all from a single model family. This isn’t about music. It’s about surface area. The more modalities Gemini covers, the stickier the ecosystem becomes. If you’re already using Gemini to draft, design, and code, why wouldn’t you use it to score your video too?
It’s the Costco model of AI: once you’re inside, everything’s right there, and leaving feels like effort.
2. Nano Banana 2: Silly Name, Serious Implications
Yes, that’s the real name. No, I don’t know what happened in the naming meeting. I can only assume someone at Google lost a very specific bet, or there’s a Willy Wonka–themed product strategy document floating around Mountain View that I desperately want to read.
But the substance underneath the branding is genuinely important. Nano Banana is Gemini’s on-device model — small enough to run locally on a phone, no cloud round-trip required. Version 2 promises improved speed and capability, which matters because on-device inference is quietly becoming one of the most consequential battlegrounds in AI.
Here’s why. Every time a model has to ping a server, you pay in three currencies: latency, privacy, and connectivity dependence. On-device models eliminate all three. Your data stays on your hardware. Responses are near-instant. And the thing works on a plane, in a tunnel, or in that one corner of your office where the Wi-Fi goes to die.
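The trade-off above is really a routing decision, and it can be sketched in a few lines. This is an architectural illustration under my own assumptions (the threshold, the labels, the idea of routing on request size), not any real SDK's behaviour.

```python
def route(prompt: str, online: bool, local_max_words: int = 512) -> str:
    """Decide whether a request runs on-device or in the cloud.

    Sketch of the trade-off: on-device wins on latency and privacy
    (data never leaves the phone) and keeps working offline; the
    cloud model only earns its network round-trip on heavy requests.
    """
    # Small requests stay local: near-instant, private, connectivity-free.
    if len(prompt.split()) <= local_max_words:
        return "on-device"
    # Heavy requests want the bigger cloud model -- if there's a network.
    # Offline, the local model answers anyway, possibly at lower quality.
    return "cloud" if online else "on-device (degraded)"
```

In a real product the routing signal would be richer than word count (task type, battery, thermal state), but the shape of the decision is the same: local first, cloud only when it pays for its three-currency cost.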
Apple’s been investing heavily here. Qualcomm is building dedicated NPU silicon for it. Google clearly isn’t sleeping. For developers and product teams, the signal is clear: the edge isn’t a niche deployment target anymore. It’s where AI gets real for most users — not in a data centre, but in their pocket.
If you’re building AI-powered products and you’re not thinking about on-device, you’re already behind.
3. APA Citations: The Boring Feature That Matters Most
This one won’t make anyone’s highlight reel, but it might be the most significant update of the three. Gemini now supports accurate APA citations when searching scientific literature — pointing you to real papers with real evidence.
Why does this matter? Because hallucinated references have been one of the most embarrassing — and genuinely damaging — failure modes of large language models. We’ve all seen the fabricated papers. The invented DOIs. The citations that read like they could exist but absolutely don’t, like a particularly convincing deepfake of academic rigour.
If Google has genuinely cracked grounded, verifiable citation — and that’s a meaningful if — it shifts the conversation from “AI as brainstorming partner” to “AI as research tool.” That’s a category upgrade. Not just vibes. Receipts.
For researchers, analysts, and anyone whose work requires evidence trails, this is worth watching closely. The gap between “helpful but untrustworthy” and “useful and verifiable” is exactly where the next wave of enterprise AI adoption lives.
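The difference between a hallucinated reference and a grounded one is that the latter is assembled from retrieved metadata rather than generated as free text. A minimal sketch of that assembly step, assuming the fields arrive from a verified record (the function and field names are mine, not Gemini's):

```python
def apa_reference(authors: list[str], year: int, title: str,
                  journal: str, volume: str, pages: str, doi: str) -> str:
    """Format an APA-7-style journal reference from structured metadata.

    The grounding argument in a nutshell: every field here comes from a
    retrieved record, so the DOI at the end is checkable, not invented.
    """
    if len(authors) == 1:
        author_str = authors[0]
    else:
        # APA 7 joins the final author with an ampersand.
        author_str = ", ".join(authors[:-1]) + ", & " + authors[-1]
    return (f"{author_str} ({year}). {title}. "
            f"{journal}, {volume}, {pages}. https://doi.org/{doi}")
```

The formatting itself is trivial; the hard part, and the part Google is claiming progress on, is ensuring the metadata feeding it corresponds to a paper that actually exists.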
The Bigger Picture
Zoom out, and the pattern is clear. Google isn’t trying to win on any single capability. They’re playing a breadth game — music, on-device, citations, code, vision, search — all under one roof. The bet is that the generalist platform wins, that convenience and integration beat best-in-class point solutions.
It’s a reasonable strategy. It’s also a risky one. “Jack of all trades, master of none” is a real failure mode, and OpenAI, Anthropic, and others aren’t exactly standing still in their respective lanes.
But if there’s one thing Google has always been good at, it’s building platforms that become habits. And with Gemini, they’re clearly betting that the AI assistant that does everything — even if imperfectly — is the one people stop leaving.
Whether that pans out is the trillion-dollar question. Literally.