I’m never going to use voice controls for my tech, sorry – and I don’t care how much better it is now thanks to AI

So Google wants me to start saying ‘Hey Gemini’ now, huh? No thanks, you can get in the sea with that nonsense. I’m not having it. Call me a Luddite, call me a curmudgeon, tell me to get with the times; I couldn’t care less, I’m not going to talk to my tech.

Now, before I get into the meat and potatoes of this article, I’d like to preface it by saying that I’m not against the existence of voice control features on the whole. They’re a vital accessibility feature that many disabled tech users rely on to get the full experience from their hardware. But for those of us who don’t actually need them – what the hell is wrong with just pressing some buttons or tapping a touchscreen?

I get annoyed if someone is talking too loudly on their phone on public transport. When tech companies like Google tell me that voice control is the future of how we interact with our tech, I’m immediately filled with horror at the idea of traveling through a city where everybody is constantly barking commands at their phones and tablets.

How many people really use voice controls?

I did some research into the actual statistics behind voice control use, and was surprised at the results. I’ve literally never seen a single person search the web on their phone with a voice command. Sure, I’ve seen people ask their Alexa smart speaker to play music or turn off a light – something I’ll probably also never do, because I always have a phone in my pocket that can do those things – but web searches? Really?

Apparently so: according to a 2018 study by PwC, 32% of voice assistant users ask their chosen digital helper something they’d normally put through a search engine at least once a day, and 89% do so at least once a month. Of course, that’s only people who already use a voice assistant, but analysis from Statista claims that almost half of Americans talk to their phones or smart speakers at least semi-regularly (though that figure drops to about 1 in 5 globally).

The thing is, as I dug further and further into these statistics, I became less and less convinced by them. For starters, the very first set of stats I came across (which I won’t link here) claimed that “8.4 billion people worldwide are estimated to use voice assistants” – that’s… more than the current total human population. I kept noticing discrepancies in the data, and I had to discard some sources entirely for obvious pro-tech-marketing bias.

More confused than enlightened, I eventually had to conclude that much of the statistical research in this area is based more heavily on product sales than on unbiased polling of the population – and that’s a serious flaw, because a person who owns one piece of voice-controlled hardware is likely to own more. I have a friend who has three identical Echo Dot smart speakers positioned in different rooms around her home, and she uses Siri on her iPhone to make music requests while in the car. Me? I just have a driving playlist that I shuffle before I start the engine.

Voice control is getting better – slowly

I will admit that my usual excuse for why I abhor voice-controlled tech doesn’t hold as much weight as it used to. That excuse was, in short: it’s crap. The early days of Siri, Cortana, and their ilk were plagued by a constant refrain of “I’m sorry, I didn’t quite understand that”, but with the dawn of AI, things are starting to improve.

Tools like Apple Intelligence and Google Gemini offer multimodal input, allowing them to understand spoken requests as well as text prompts. Today’s AI assistants, which pair modern speech recognition with large language models, do a far better job of parsing spoken words than older voice-recognition software, and they can even adapt to an individual user’s speech patterns over time to give more accurate responses.

However, there are still stumbling blocks to be overcome. While voice recognition typically supports multiple languages, it frequently struggles with strong accents and speech impediments (I myself have a lisp, which doesn’t help matters). This can be due to unnoticed biases in the training data used: if an American company uses recordings of Americans speaking English to train its speech recognition AI to understand spoken English, it’s unsurprisingly going to struggle when it hears a Japanese or Swedish person speaking that language.

I do genuinely hope that one day voice controls work perfectly, because the people who really need them deserve a service that works as well as simply typing a query into Google. But I won’t be using them, and I don’t want to live in a future where everybody is – you can bet I’ll be first in line to dunk on any tech company that tries to make voice commands the default way of interacting with its product.
