The First Open Virtual Assistant Workshop was held on October 30, 2019, as part of the Stanford HAI Symposium. The goal of the workshop was to bring together people interested in this topic to share their interests, the state of the art, and the latest research, and to learn how to further develop the field. What started out as an invitation-only workshop for 50 people turned into a meeting of 70, representing about 30 institutions and 8 different countries. Due to popular demand, the workshop was live-streamed in its entirety on YouTube.
Besides the five largest tech companies (Amazon, Apple, Facebook, Google, Samsung), we attracted the largest IT enterprises (IBM, Microsoft, Salesforce), consumer services (Comcast, Comscore, Square, Uber), startups (Cloudplugs, Consilient Lab), open-source companies (Home Assistant, Mozilla), as well as non-profits (Alfred P. Sloan Foundation, EP3 Foundation, The Open Voice Networks). Many foreign organizations were represented as well, from France (CEA - Alternative Energies and Atomic Energy Commission), Japan (SmartNews), Korea (SK Telecom), Kazakhstan (Nazarbayev University and BTS Digital), Spain (Sherpa, Telefonica), Taiwan (HTC, Trend Micro), and the UK (Santander Bank).
To collect as many points of view as possible, everybody had a chance to speak. We had 20 short talks covering open-source work and research, followed by a well-attended reception where many useful conversations took place. From the talks, discussions, and advice received during and after the workshop, we concluded that “trustworthy virtual assistants” is an important and challenging discipline whose research and education depend on a robust, usable, open-source, privacy-preserving virtual assistant platform. The OVAL lab is committed to making it happen. We are seeking funding so we can couple our research with significant engineering resources to make this technology freely accessible to all for advancing science and social good.
The diversity of companies and countries represented in this small workshop suggests that there is industry-wide interest in virtual assistants. The sentiment that the world needs better protection of privacy, expressed by Ceri Godwin (Santander), was echoed by many at the workshop. Doron Weber (Alfred P. Sloan Foundation) encouraged us to think big: “What does it take to change the world?” This was the first time we saw the possibility and importance of investing in technology for social good.
While commercial assistants today struggle with simple commands, Jayesh Govindarajan (Salesforce) discussed how conversations have abstract structures, models, and domain-independent types (informational, conformational, and transactional). We need academia to collectively develop common languages, abstractions, methodologies, and tools to eventually create assistants that can converse about the entire digital world and perform arbitrary digital tasks. As an example, the small Almond team at Stanford was able to create an assistant that can answer more complex questions and perform more complex tasks than commercial assistants by combining machine learning with a high-level programming language called ThingTalk. The power of common languages is well understood: C and Java, for example, greatly improved productivity over assembly coding.
Larry Heck (Samsung Viv Labs) inspired us with his vision that, in the future, we can teach assistants new tasks via demonstrations. The need for research in trustworthy virtual assistants is huge: How to create effective voice interfaces (Rob Chambers, Microsoft; Jofish Kaye, Mozilla)? How to go from simple commands to dialogs (Hakkani-Tur, Amazon; Zhu, Microsoft)? How to make good, engaging, and empathic conversations (See, Stanford; Chaganty, Square; Bernstein, Stanford)? How to augment speech with speaker head tracking and graphical interfaces (Yang, Stanford)? How to help users manage data sharing (Duranton, CEA; Campagna, Stanford)? How to scale and protect privacy with conversational systems that execute on edge hardware (Ravi and Kozareva, Google)? How can we help people share their data while preserving their privacy, perhaps with the use of multi-layer ledgers (Chang, HTC)?
Prem Natarajan (Amazon Alexa) spoke about the importance of open platforms for educating the workforce. The paradigm shift to voice needs a new generation of developers who can master this technology.
The OVAL lab has developed a fully working prototype of an open assistant, which was recently bundled with the Home Assistant gateway (Schoutsen, Home Assistant). This setup runs locally to keep data private. The Almond virtual assistant platform has these components:
While our open-source prototype demonstrates how new technology can advance the state of the art of assistants, it is not productized and hence not usable by consumers. Without users, it is inadequate as an experimental platform to support research and education. Open-source platforms have been shown to be critical to education, development, and research in operating systems and compilers.
Given the need, complexity, and scale of trustworthy virtual assistants, our conclusion is that we need to build a usable, robust, privacy-preserving assistant infrastructure with the latest technology, and open it up to companies and academia. Even though we expect open-source contributions from many companies and individual developers, the success of such an infrastructure requires a team of professional engineers working in tandem with researchers. The resources needed are commensurate with typical venture investments, but are seldom available to academia. Investing in open intellectual property and platforms is needed to disrupt the surveillance economy and monopoly platforms that have become prevalent in the last decade. We hope to establish the open-source movement in three years, after which we expect a healthy economy with startups and enterprises building on this platform. We welcome your advice, support, and collaboration to make this project a success.