try that request if you have an iPhone 4S...
There were several interesting things to note about the iPhone 4S announcement. For me the most significant are Siri and Apple’s ability to use relatively modest hardware and still squeeze out a superior user experience compared with many Android devices while using much less energy. I’ll concentrate on Siri.
Siri was purchased by Apple a few years ago and they’ve clearly done a lot of user interface design and speech recognition work, but Siri has a very interesting background - DARPA’s PAL project.
The Personalized Assistant that Learns was a massive five-year program that was managed by the Stanford Research Institute (SRI). DARPA focuses on blue sky projects that have the potential to provide enormous change - the internal benchmark is that any project should deliver a tenfold improvement in performance over what currently exists - or it should open an entirely new area. DARPA projects have included the GPS system and what became the Internet. They’ve also had a lot of failures as many of these are very speculative, but DARPA is all about getting a few home runs. The only industrial organizations since WWII that have had such speculative projects have been Bell Labs and IBM’s Watson Labs - and both of those pretty much abandoned the highly speculative route by 1980.
The SRI folks called the project the Cognitive Assistant that Learns and Organizes - CALO for short. In their talks they would mention the name was from the Latin calonis - or soldier’s servant. It was clear where this might be used.
It was very multidisciplinary - I’m looking at one of their foil stacks.1 “Over 300 researchers from 25 of the top research Universities in the US with the goal of building a new generation of cognitive assistants that can reason, learn from experience, be told what to do, explain what they are doing, reflect on their experience, and respond robustly to surprise.”
Also from the foils:
CALO assists its user with six high-level functions:
- Organizing and Prioritizing Information: As the user works with email, appointments, web pages, files, and so forth, CALO uses machine learning algorithms to build a queryable model of who works on which projects, what role they play, how important they are, how documents and deliverables are related to this, etc.
- Preparing Information Artifacts: CALO can help its user put together new documents such as PowerPoint presentations, leveraging learning about structure and content from previous documents accessed in the past.
- Mediating Human Communications: CALO provides assistance as its user interacts with other people, both in electronic forums (e.g. email) and in physical meetings. If given access to participate in a meeting, CALO automatically generates a meeting transcript, tracks action item assignments, detects roles of participants, and so forth. CALO can also put together a "PrepPak" for a meeting containing information to read ahead of time or have at your fingertips as the meeting progresses.
- Task Management: CALO can automate routine tasks for you (e.g. travel authorizations), and can be taught new procedures and tasks by observing and interacting with the user.
- Scheduling and Reasoning in Time: CALO can learn your preferences for when you need things done by, and help you manage your busy schedule.
- Resource allocation: As part of Task management, CALO can learn to acquire new resources (electronic services and real-world people) to help get a job done.
Very disruptive stuff if you can pull it off. There were several commercial spinoffs from the work. One was a small company called Siri (a play on SRI). Their intention was to apply some of the tech to mobile phones. Ideally some mobile tasks could have a viable speech interface - this would get around some of the problems of typing, multiple clicks and screen loading. This clearly doesn’t work for all tasks, but there is great potential for many.
Apple bought Siri for something like $200M a few years ago. There was obviously a lot of user interface work as well as getting untrained speaker recognition working. There is clearly a lot of work still to be done. At the moment their information-supplying partners are not very rich. Wolfram Alpha is interesting as they provide query responses in the form of images to prevent search engines from mining them.
I was amazed by the fact that it worked at all in some of the YouTube videos I’ve seen. It is not clear to me where this fits in - and that is largely because I haven’t been able to live with it and think about it a lot. That will have to wait until June for my mobile contract to expire. There is real potential. I would guess Siri a year from now will be a very different service.
It would be interesting to see how the computational load is distributed: just what is happening on the iPhone, and how much time is allotted for the various critical steps.
Really interesting stuff!! You see things like this and realize that, going forward, the mobile phone makers without interesting platforms are toast. Features like Siri may differentiate and seriously enhance Apple's iOS platform even if it is far from a polished state now. Only time will tell...
Oh - Siri is a not too uncommon Nordic woman's name. I know a Norwegian woman named Siri. She has a very quirky personality and I would hope Apple builds a lot of quirkiness into their Siri.
_______
1 I'm certainly not an expert in AI or NLP, but I worked in a research center that had AI and strong NL departments, and I follow some of the work in industry and more so at DARPA.
If anyone wants to buy me an unlocked iPhone 4S I'll happily play and investigate :-) Sadly my AT&T mobile contract is far too restrictive for me to consider getting a new phone before June 2012.
Ask Siri the average airspeed of an unladen swallow - She apparently has never watched "Monty Python and the Holy Grail"
Posted by: Howard Greenstein | 10/16/2011 at 08:43 PM
ha! and strange as wolframalpha gets it right (and quotes monty python)
They should crowdsource adding strange bits of fiction culture
Posted by: steve | 10/16/2011 at 09:12 PM
I should note that people are citing this video as evidence Siri doesn't work and that it does. Clearly a distribution in expectations. I suspect it is first useful for doing things like setting alarms and texting while driving - things that take too much screen attention, but are easily articulated with a very small universe of possibilities. It would also potentially make an interesting interface for watching programs on the big screen.
Posted by: steve | 10/16/2011 at 09:16 PM