Posted by Andy Reyes
tl;dr: Performing complicated tasks on your computer with Natural Language is the future of human-computer interfaces, and I’ve been thinking about it and planning my UCLA Computer Science career around it years before Apple brought Natural Language input to the world’s attention at their “Let’s Talk iPhone” keynote.
I have been studying Computer Science at UCLA for four years, and I have spent three of them talking to my friends and colleagues about how artificial intelligence, combined with Natural Language Processing, will be the next big step forward for the mainstream computer market.
We are currently going through another paradigm shift in how we interact with our data. In the early days of computers, we manipulated our data with punch cards. Our tactile interactions with the keyboard gave birth to the Command-Line Interface. Then came the single-point interactions of the mouse with Graphical User Interfaces. Next came touch and Multi-Touch, where manipulation of data is literally at our fingertips.
This evolution of human-computer interaction did not just come about haphazardly. With each iteration of computer interfaces, tasks that users could perform became more and more complex, while the amount of “work” required to perform these tasks became less and less. Imagine trying to use (let alone implement) a command-line version of Adobe Photoshop. Imagine if those without a digital music keyboard or MIDI controller had no alternative to using their QWERTY keyboard to compose the symphony resonating in their heart. Imagine trying to compose that same symphony using punch cards! It is clear that the purpose of every evolution in computer interfaces has been to perform more complex tasks with less work required by the user.
Now that we can manipulate our data with our fingers via Multi-Touch, what is the next evolution in human-computer interaction? I believe, and have believed for a long time, that the next step is Natural Language. Natural Language is the way all humans communicate and relate with each other. It is a way we can express our desires and describe the world around us. And soon, it will facilitate deeper and more meaningful interactions with our computers and our data.
While I have much more to say about Natural Language as the future of human-computer interaction, I cannot fit it all in this one blog post. Below are just a few highlights of topics I think about often and am sure to expand upon in later posts.
Interactions with Metadata
Something that mice and keyboards and even Multi-Touch never allowed for is the manipulation and effective use of file metadata. For example, with Natural Language input, I can say:
“Open any notes from all my Computer Science classes that have to do with NLP or human-computer interfaces I made this past school year.”
In seconds, all my Computer Science notes about NLP and HCI created this school year pop up, ready to be reviewed.
While this example may seem a bit simplistic, the task just performed by my computer actually took quite a bit of intelligence and cognition. It also featured utilization and knowledge of file metadata, including file location, file type, file creation date, and even the understanding that a “school year” refers to a September-June timeframe, not a January-December timeframe. This is novel, since current computer interfaces do not allow us to intuitively access and utilize file metadata aside from being able to enable filters on search results. The request also contains ambiguity: Am I searching for files that I made that are about human-computer interfaces? Or am I searching for files documenting the various human-computer interfaces that I created myself?
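To make one piece of that interpretation concrete: before any search can run, the phrase "school year" has to be resolved into actual dates. The Python sketch below is one hypothetical way to do it. The function name `most_recent_school_year` and the choice of September 1 through June 30 as the boundaries are assumptions made for this example, not a description of how any real assistant works.

```python
from datetime import date


def most_recent_school_year(today):
    """Return (start, end) dates of the latest school year that has begun.

    Assumption: a school year runs from September 1 to June 30 of the
    following calendar year.
    """
    # From September onward the current school year started this year;
    # before September it started in the previous calendar year.
    start_year = today.year if today.month >= 9 else today.year - 1
    return date(start_year, 9, 1), date(start_year + 1, 6, 30)


# Asking in mid-June 2012 resolves to September 2011 through June 2012.
start, end = most_recent_school_year(date(2012, 6, 15))
print(start.isoformat(), end.isoformat())
```

Even this tiny rule encodes cultural knowledge that a calendar alone does not have, which is the point of the example.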
The above task could be performed in the current mouse-and-keyboard age by opening Find and creating the following filters:
Search in folder “Computer Science”
Kind is Text
Date created is between “September 2011” and “June 2012”
File contains “NLP” OR File contains “Natural Language Processing” OR File contains “HCI” OR File contains “Human-Computer Interaction”
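Expressed as code, those four filters amount to a single conjunction of tests. The Python sketch below is only an illustration: `FileRecord`, `matches_filters`, and the sample records are hypothetical stand-ins for whatever the Find program does internally, and the string matching is assumed to be a case-insensitive substring search.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileRecord:
    """Hypothetical in-memory stand-in for a file and its metadata."""
    path: str
    kind: str
    created: date
    text: str


SEARCH_TERMS = (
    "NLP",
    "Natural Language Processing",
    "HCI",
    "Human-Computer Interaction",
)


def matches_filters(record, folder, start, end, terms=SEARCH_TERMS):
    """Apply the four filters: folder, kind, creation date, contents."""
    in_folder = record.path.startswith(folder.rstrip("/") + "/")
    is_text = record.kind == "text"
    in_range = start <= record.created <= end
    body = record.text.lower()
    # The "contains" filters are joined by OR, so any one term is enough.
    has_term = any(term.lower() in body for term in terms)
    return in_folder and is_text and in_range and has_term


# Hypothetical sample data, invented for this example.
records = [
    FileRecord("Computer Science/lecture3.txt", "text",
               date(2012, 1, 10), "Parsing basics for NLP"),
    FileRecord("Computer Science/old.txt", "text",
               date(2010, 10, 1), "HCI overview"),
]
hits = [r.path for r in records
        if matches_filters(r, "Computer Science",
                           date(2011, 9, 1), date(2012, 6, 30))]
print(hits)
```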
The mouse-and-keyboard method of performing this task takes at least nine mouse clicks and a minimum of 98 keystrokes. And even then, this only searches for files containing the specified strings. The Find program will not return anything about NLP or HCI if those specific words do not exist in the file. The Natural Language input method is clearly superior for this type of information retrieval task.
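The figure of 98 keystrokes is consistent with counting one keystroke per character of the values typed into the filters. The short Python tally below shows that arithmetic. It assumes that Shift presses are not counted and that "Text" is picked from a menu with the mouse rather than typed.

```python
# Values typed into the Find filters, one keystroke per character.
typed_values = [
    "Computer Science",
    "September 2011",
    "June 2012",
    "NLP",
    "Natural Language Processing",
    "HCI",
    "Human-Computer Interaction",
]

total_keystrokes = sum(len(value) for value in typed_values)
print(total_keystrokes)  # prints 98
```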
Of course, this is not the only type of task Natural Language would be suitable for. The types of tasks that could benefit from Natural Language input are numerous and complex:
“Scale this image to six-twenty-four by three-fifty-two pixels.”
“Don’t play any more Justin Bieber songs until the house party is over.”
“Find all images of me and my wife and create a slide show set to the song ‘When I’m Sixty-Four’ by the Beatles.”
At this point you may want to call me a Natural Language Lunatic for my outrageous dreams and aspirations for the power of Natural Language and computers. But I believe that the technology to carry out everything mentioned in this post (and much, much more) is very close at hand. So close, in fact, that I’ve made sure to take Computer Science courses in artificial intelligence and Linguistics courses on “Mathematical Structures in Language” and “Computational Linguistics” while at UCLA. I am taking these courses, and a few more, in hopes of being able to connect the dots in the future and either bring about or creatively utilize Natural Language Processing and Artificial Intelligence technology in ways we could only dream of in movies and blog posts about the future of computer interfaces.