Here’s an Ethical Open-Source Alternative to Alexa


When I was young, my parents gave me cassette tapes of old-time radio broadcasts to listen to. One of my favorites was a 1948 episode of Quiet Please. In “The Pathetic Fallacy,” an engineer named Quinn brags (in glorious ’40s tech-speak) to a pair of journalists about the giant computing machine that his organization has constructed:

The actual machine is behind those walls. Three rooms full of tubes and motors and stroboscopes and several thousand miles of wiring and some devices that are not public property yet. The machine took six years to build, and a total of eighty-one expert technicians were employed continually during that time. So you can understand that any one man knows very little about the actual construction of this giant, mechanical brain.

Quinn’s fortunes plummet when he realizes that “the machine” has become self-aware. It has grown angry with him. It can speak. It sabotages his career until he almost changes his name and flees into hiding. Finally, we learn that the machine, a mighty assemblage of gears and motors, is in love with Quinn and only desires his love in return.

The talking machines of today are small by comparison but are every bit as disruptive and incomprehensible as “the machine” of my childhood radio drama. As with video calls and virtual reality, digital assistants have completed the transition from the stuff of ’40s science fiction to a banal, almost innocuous standard. Like the machines of old, digital assistants occupy many rooms. They perch on our bedside tables, mingle with the paper towels on our countertops, and crouch beside our televisions. An expected 1.8 billion digital assistants will be in homes and offices by 2021, listening patiently to our voices and trying to help their companies to understand our thoughts and desires.

Source: Bloomberg Businessweek


From a routine request for map directions to a casual sext, from looking up the age of a celebrity to looking up the symptoms of the novel coronavirus, digital assistants report on their users and help technology companies profile them, the better to target them with ads or to build services. Many users don’t realize that, in addition to faceless algorithms, teams of human contractors are assigned to review sound files extracted by digital assistants. Once a machine is activated, these contractors hear all the myriad recordings of our daily life, both intentional and unintentional. They report hearing the things that we do not intend for others to hear, from farts, to sex, to worried conversations around the dinner table.

The information age is all about the growth and spread of data, we were told, but somehow the sacred, private mundane was supposed to be left off the table. Not so much.

Wherever there are privacy concerns and fears of big tech, however, there is a scrappy open-source alternative. In the case of digital assistants, one of the major open-source efforts is Mycroft AI. Mycroft is a digital assistant like Alexa or the Google Assistant, created by a company that has been working to get a consumer-grade product out the door since a 2018 Kickstarter campaign raised $400,000. As of this writing, the hardware has been delayed, but the software (which can be downloaded for free) is very much available.


As with many open-source alternatives, adopting a tool like Mycroft over big tech competitors requires many trade-offs. Mycroft sometimes struggles to understand the user, and the software’s ability to parse actual human speech in all of its halting, unstructured glory is often lacking. What distinguishes the creators of Mycroft from big tech companies, however, is that they develop their product on the basis of consent. Part of the reason that Google, Siri, and Alexa are so good at understanding humans is that they have such a big corpus to work with. They absorb millions of utterances from their users, capturing voices en masse, typically without users knowing that their voices are being used to train the computer models. (The difference between Apple and Google or Amazon is that it’s not part of Apple’s business model to associate the data with a personal profile, so it takes steps to anonymize the sound files. Each of these companies has developed a system to have humans review the sound files.) The result is an outcome emblematic of big tech: astounding products, costing only the privacy and consent of millions to produce and maintain.

By contrast, Mycroft requires users to opt in before using their voice to build a model. Mycroft has partnered with Mozilla on its Common Voice program, an initiative that allows people to volunteer their voices in order to build a speech-to-text engine available to all. The philosophy is the antithesis of the old Facebook maxim “move fast and break things.” Open-source developers tend to move slowly and build carefully, so that the final product doesn’t require a Faustian exchange of privacy for convenience. As a result, you have a product that is a little rough around the edges but much more pleasant to use.

I’ve written before about how the trade-offs of open-source initiatives and divorcing oneself from big tech on mobile devices are often painful, even excruciating. My experience with Mycroft has been quite the opposite: far from being tied to any big tech service, the assistant is delightfully self-contained. A few summers ago, I made a weekend of a DIY project to fit Mycroft AI into a hollow book. A cheap Raspberry Pi, a microphone, a set of desktop speakers, and a small screen came together to create a truly “personal” computer, a crafted device that was mine rather than an appendage of a large, data-hungry corporation. Today it sits beside my armchair at home, and I delight in it.

There is something immensely satisfying about developing the skills of your own digital assistant. One benefit of an “open” platform is the ability to create the code you want to run on it. For example, as an experiment, I developed a Python script for Mycroft that would scrape and report back information from a website that tracks the status of global civic freedom. It’s a script I doubt any other person would use, but it makes me feel that the device is truly my own.
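A script along those lines might look something like the sketch below. It is not the original script: the page markup, the fictional country names, and the helper names `RatingParser`, `parse_ratings`, and `spoken_report` are all assumptions made for illustration, and it parses a hard-coded HTML snippet rather than fetching a live site. It uses only Python’s standard library.

```python
from html.parser import HTMLParser

# Hypothetical markup and fictional countries: the real site's structure
# is not described in the article, so this snippet stands in for it.
SAMPLE_HTML = """
<table>
  <tr><td class="country">Freedonia</td><td class="rating">Obstructed</td></tr>
  <tr><td class="country">Sylvania</td><td class="rating">Open</td></tr>
</table>
"""


class RatingParser(HTMLParser):
    """Collect country -> rating pairs from <td class="country"/"rating"> cells."""

    def __init__(self):
        super().__init__()
        self.ratings = {}
        self._field = None    # class of the cell currently being read, if relevant
        self._country = None  # most recent country name seen

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            cls = dict(attrs).get("class")
            self._field = cls if cls in ("country", "rating") else None

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "country":
            self._country = text
        elif self._country is not None:
            # A rating cell: pair it with the country that preceded it.
            self.ratings[self._country] = text


def parse_ratings(html):
    """Return a dict mapping each country in the page to its rating."""
    parser = RatingParser()
    parser.feed(html)
    return parser.ratings


def spoken_report(ratings, country):
    """Build the sentence the assistant would say aloud."""
    rating = ratings.get(country)
    if rating is None:
        return f"I could not find a rating for {country}."
    return f"Civic freedom in {country} is rated {rating.lower()}."


if __name__ == "__main__":
    ratings = parse_ratings(SAMPLE_HTML)
    print(spoken_report(ratings, "Freedonia"))
```

In an actual skill, the sentence returned by `spoken_report` would be handed to the assistant to read aloud, and the HTML would come from a network request instead of a constant. Keeping the parsing and the phrasing in plain functions means they can be tried out on a laptop before anything is installed on the device.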

Technology is at its best when it acts as an extension of its user, existing purely to serve a function. That is a rare thing for devices attached to the internet. In this context, the relationship between human and machine can most often be characterized as an exchange: the user gains some information or service in return for money or personal data. While there are many users who understand this exchange with tech companies, there are plenty who don’t pore through the miles of terms of service agreements, or know how to tell these companies not to extract their data. There is value in a service like Mycroft, which requires users to opt in to data-sharing and so gives them a choice about how their data is used.

The pay-with-your-data model allows many technology companies to offer their services for “free.” By contrast, sustainability issues plague many open-source software projects. The tragedy of having free, useful code is that the maintainers often don’t have the resources to support themselves long term. Infamously, the Heartbleed security flaw of several years ago was the result of the popular OpenSSL project being maintained by just one person operating with just $2,000 a year in donations. Because they typically don’t profit off user data, open-source initiatives usually subsist the old-fashioned way, with users giving money. And users, beguiled by pay-with-your-data models, are notoriously loath to pony up.

This week I came to the conclusion that it was worth committing to sustain Mycroft. As a person who habitually blocks trackers and makes use of a VPN, I did the typically unthinkable: I opted in to Mycroft’s collaboration with Mozilla. My interactions with the machine will be analyzed, not only by Mycroft AI to improve its models of human speech, but also by any other tech company using Mozilla’s DeepSpeech engine. I also forked over $20 for an annual Mycroft subscription, which joins an open-source social network and an open-source maps tool on the list of open-source services I pay for. If we’re supposed to use our wallets to shape the tech sector, I want to use my dollars to promote ethical software. I’m willing to set aside my tinfoil hat and dip into my wallet for the cause.

The “Pathetic Fallacy” radio drama ends with the engineer Quinn declaring his love for “the machine,” speaking in the mathematical phrases that are the device’s natural language. Perhaps that’s just an allegory for how we’ve collectively ended up embracing technology, mercurial and ineffable as it can be. What our next steps are with the machines we’ve invited into our homes is unclear. Like Quinn, we have to make do, understanding that if we have to fall in love with computers, it’s better to go for the ones that try to love us back.