Earlier today I received a note from Jheri about her grades for the two remote courses she recently finished. I had offered to help her with differential equations if she had any problems with the course, but other than suggesting and grading some interesting practice problems, I didn't do anything. She is exceptionally motivated and breezed through this and a biology course. It must have been challenging as she has a career that involves a boatload (well - planeload) of international travel.
So three cheers for Jheri! I am toasting her with some proper homemade ice cream tonight.
Jheri is now in her mid-20s and was a poor student in high school. There were some reasons for this, but the bottom line is that she wasn't ready to learn. We have an education system that assumes a readiness on the part of the student across many subjects that is probably optimal for a small minority. But now she wants it and is focused on learning biology and has been finding that math and science come easily to her. I'm not surprised - she has a sparkling curious mind and has discovered the wonderful ignorance that science provides. She has a career, but is getting a bit of schooling in her spare time and will soon quit and go full time. I wouldn't be surprised to see her go on for an advanced degree.
I'm finding something similar about myself. I was extremely interested in math and science as a kid and far too focused (another Jeri knew me a bit then and will probably vouch for that). Then I found myself as an undergrad in a school that was really a glorified trade school. I tested out of many of the "non important" subjects and continued my singular focus. Grad school was a mentor relation that focused almost entirely on physics. As a result I'm extremely narrow - something that has been bothering me a lot. I have been developing interests in history, art and music, playing with these a bit as a curious amateur. Perhaps one day I'll go to a real university and learn how to ask a few questions in these areas in much greater detail. At least I am identifying my ignorance and am curious enough to learn. It doesn't matter that I don't know much - at least not to me.
Being ready to learn happens in areas you may not immediately suspect. The men's volleyball team at UC Irvine took the NCAA Division 1 championship this past weekend. Except for volleyball they have an unremarkable athletic department, but head coach John Speraw is one of those remarkable characters who is both cerebral and humble. He knows that he doesn't know and seems to delight in learning.
Speraw hasn't been with Irvine that long, but has managed to put together three national championships. During the last season he raised eyebrows when he hired a female psychologist with no volleyball experience as one of his assistant coaches - even though the program is too small to support full time assistant coaches. This weekend he praised her, noting she gave him and the team the right tools to do the remarkable. Of course they, as an organization, had to be radically different from the standard men's volleyball team in order to understand, modify and implement what she was telling them. Speraw observes, experiments and learns from his successes and failures. He has turned his staff on to his techniques and the players are doing the same. UC Irvine now has a learning organization that is crushing organizations with rich histories and much larger financial resources. My guess is more than a few men's programs will hire psychologists and most (all?) of them will be too inflexible to learn and adapt. In the meantime John will find other tools.
One wonders about the parallels with other organizations - corporations for example.
When are companies ready to learn and can you identify them by the questions they are asking? Does their thinking become calcified over time and do they see a need to learn and adapt? Where do they find the questions to learn from and what are their mechanisms for dealing with the questions? Do they have an institutional learning process and an institutional memory to take advantage of their learnings? Do they recognize the value of failure? Do they recognize the value of play? Are those charged with running the company curious enough to ask questions or have they followed a path that has given them success to date giving them an artificial sense of confidence? How do they select and promote employees at all levels and is the ability and readiness to learn part of the process?
My guess is learning organizations are probably rare. I'm reasonably familiar with Apple and Pixar and both are great examples where learning is central. I'm also familiar with many others that are mostly unable - or at least unready - to learn.
Is your organization curious enough? Does it actively identify, cultivate and celebrate its ignorance? Is it diverse enough to connect dots and identify potentially new areas of ignorance - areas that, when explored by the right people, will illuminate opportunity that others hadn't considered?
Think about the last paragraph and imagine Apple and Pixar. Try it again with average organizations you know.
It makes me think of how new ideas and techniques are identified and, if useful, implemented. Many companies follow bandwagons. Some companies sort out the world by asking the right questions and blaze their own paths, but many become little more than buzzword compliant, with a considerable amount of time, talent and money spent in the process.
I think data mining and "big data" (I dislike both terms for a variety of reasons - sit me down with some cold milk and fresh cookies and I'll go into detail) are a great example of the potential success and failure of institutional learning. A few organizations will understand what they are doing deeply enough to be able to play in this area effectively, but many are probably going to fail as has been true with many "next big things" over time.
You really need to understand something deeply enough to know if there is a there there. I was drawn to physics as, in principle, it is so simple. Simple enough that you have some hope of asking fundamental questions in a productive fashion. But add a bit of complexity and the world rapidly becomes far too complex to understand at a fundamental level. There may be levels of possible understanding that describe it and that, if you are sufficiently clever, allow you to identify those techniques that will work for your organization.
The term "data" is highly contextual - so much that I try not to use it. It isn't a fundamental piece of information, but rather is a cloud of information surrounding the core of what you think it is. Some of this cloud may be obvious and some may be hidden. As an example consider this well-known description of an email problem. The system is relatively simple (I've written successful email gateways so they must be simple), but the description assumes you have a bit of familiarity with the subject. Feel free to skip over it if your eyes glaze over - the point is there was an obscure condition that eluded detection resulting in some puzzling behavior.
The following is the 500-mile email story in the form it originally appeared, in a post to sage-members on Sun, 24 Nov 2002:
From email@example.com Fri Nov 29 18:00:49 2002
Date: Sun, 24 Nov 2002 21:03:02 -0500 (EST)
From: Trey Harris <firstname.lastname@example.org>
To: email@example.com
Subject: The case of the 500-mile email (was RE: [SAGE] Favorite impossible task?)

Here's a problem that *sounded* impossible... I almost regret posting the story to a wide audience, because it makes a great tale over drinks at a conference. :-) The story is slightly altered in order to protect the guilty, elide over irrelevant and boring details, and generally make the whole thing more entertaining.

I was working in a job running the campus email system some years ago when I got a call from the chairman of the statistics department.

"We're having a problem sending email out of the department."

"What's the problem?" I asked.

"We can't send mail more than 500 miles," the chairman explained.

I choked on my latte. "Come again?"

"We can't send mail farther than 500 miles from here," he repeated. "A little bit more, actually. Call it 520 miles. But no farther."

"Um... Email really doesn't work that way, generally," I said, trying to keep panic out of my voice. One doesn't display panic when speaking to a department chairman, even of a relatively impoverished department like statistics. "What makes you think you can't send mail more than 500 miles?"

"It's not what I *think*," the chairman replied testily. "You see, when we first noticed this happening, a few days ago--"

"You waited a few DAYS?" I interrupted, a tremor tinging my voice. "And you couldn't send email this whole time?"

"We could send email. Just not more than--"

"--500 miles, yes," I finished for him, "I got that. But why didn't you call earlier?"

"Well, we hadn't collected enough data to be sure of what was going on until just now." Right. This is the chairman of *statistics*. "Anyway, I asked one of the geostatisticians to look into it--"

"Geostatisticians..."

"--yes, and she's produced a map showing the radius within which we can send email to be slightly more than 500 miles. There are a number of destinations within that radius that we can't reach, either, or reach sporadically, but we can never email farther than this radius."

"I see," I said, and put my head in my hands. "When did this start? A few days ago, you said, but did anything change in your systems at that time?"

"Well, the consultant came in and patched our server and rebooted it. But I called him, and he said he didn't touch the mail system."

"Okay, let me take a look, and I'll call you back," I said, scarcely believing that I was playing along. It wasn't April Fool's Day. I tried to remember if someone owed me a practical joke.

I logged into their department's server, and sent a few test mails. This was in the Research Triangle of North Carolina, and a test mail to my own account was delivered without a hitch. Ditto for one sent to Richmond, and Atlanta, and Washington. Another to Princeton (400 miles) worked.

But then I tried to send an email to Memphis (600 miles). It failed. Boston, failed. Detroit, failed. I got out my address book and started trying to narrow this down. New York (420 miles) worked, but Providence (580 miles) failed.

I was beginning to wonder if I had lost my sanity. I tried emailing a friend who lived in North Carolina, but whose ISP was in Seattle. Thankfully, it failed. If the problem had had to do with the geography of the human recipient and not his mail server, I think I would have broken down in tears.

Having established that--unbelievably--the problem as reported was true, and repeatable, I took a look at the sendmail.cf file. It looked fairly normal. In fact, it looked familiar.

I diffed it against the sendmail.cf in my home directory. It hadn't been altered--it was a sendmail.cf I had written. And I was fairly certain I hadn't enabled the "FAIL_MAIL_OVER_500_MILES" option. At a loss, I telnetted into the SMTP port. The server happily responded with a SunOS sendmail banner.

Wait a minute... a SunOS sendmail banner? At the time, Sun was still shipping Sendmail 5 with its operating system, even though Sendmail 8 was fairly mature. Being a good system administrator, I had standardized on Sendmail 8. And also being a good system administrator, I had written a sendmail.cf that used the nice long self-documenting option and variable names available in Sendmail 8 rather than the cryptic punctuation-mark codes that had been used in Sendmail 5.

The pieces fell into place, all at once, and I again choked on the dregs of my now-cold latte. When the consultant had "patched the server," he had apparently upgraded the version of SunOS, and in so doing *downgraded* Sendmail. The upgrade helpfully left the sendmail.cf alone, even though it was now the wrong version.

It so happens that Sendmail 5--at least, the version that Sun shipped, which had some tweaks--could deal with the Sendmail 8 sendmail.cf, as most of the rules had at that point remained unaltered. But the new long configuration options--those it saw as junk, and skipped. And the sendmail binary had no defaults compiled in for most of these, so, finding no suitable settings in the sendmail.cf file, they were set to zero.

One of the settings that was set to zero was the timeout to connect to the remote SMTP server. Some experimentation established that on this particular machine with its typical load, a zero timeout would abort a connect call in slightly over three milliseconds.

An odd feature of our campus network at the time was that it was 100% switched. An outgoing packet wouldn't incur a router delay until hitting the POP and reaching a router on the far side. So time to connect to a lightly-loaded remote host on a nearby network would actually largely be governed by the speed of light distance to the destination rather than by incidental router delays.

Feeling slightly giddy, I typed into my shell:

    $ units
    1311 units, 63 prefixes

    You have: 3 millilightseconds
    You want: miles
            * 558.84719
            / 0.0017893979

"500 miles, or a little bit more."
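If you don't have units handy, the closing arithmetic is easy to check. What follows is a minimal sketch of my own, not part of Trey's post: the function name is hypothetical, and it assumes the simple one-way reading that the units session uses (the distance light covers in three milliseconds), ignoring round trips and any delay in the hardware.

```python
# Distance light travels in a given time, converted to statute miles.
# The two constants are exact by definition; the function name is
# hypothetical and exists only for this illustration.

SPEED_OF_LIGHT_M_PER_S = 299_792_458   # metres per second (exact)
METRES_PER_MILE = 1609.344             # international statute mile (exact)


def light_distance_miles(seconds: float) -> float:
    """Return how far light travels in `seconds`, in miles."""
    metres = SPEED_OF_LIGHT_M_PER_S * seconds
    return metres / METRES_PER_MILE


if __name__ == "__main__":
    # Three milliseconds is the connect timeout observed in the story.
    miles = light_distance_miles(0.003)
    print(f"{miles:.2f} miles")
```

Run as a script, this prints a figure of roughly 558.85 miles, which agrees with the 558.84719 that units reported.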
This example is really easy - it involved a well-defined system built around logical constructs.1 Understanding humans and human behavior is much more difficult and anyone who says they can is naïve and/or lying. Data mining may give candidate shards of information, but they are highly contextual and you must know the steps used to gather, filter and process the information and how reliable they are. You had damn well better know the errors and error propagation. And even then the resulting information may not be terribly useful as human behavior isn't exactly an axiomatic system.
It is more important to develop a deep understanding of what you need to know. You must learn what it is you need to know and how to get a "good enough" approximation. For companies like Apple, Pixar, Trader Joe's, and Lululemon, success doesn't rest on data mining, but on other techniques. It is likely data mining techniques wouldn't be terribly useful to these companies for their current products. It may be that Apple finds some great uses as Siri builds out, but for now great design doesn't rest on it.
I suspect the companies successful with data mining will be vendors who sell to the masses and don't really care about the results as long as the money keeps coming in, along with a few companies who have learned there is a solid niche for them and can ask the right questions to know the information produced is robust enough to make a material difference in their bottom line. I suspect the latter group will be made up of companies who are indeed ready to learn.
I'm very proud of Jheri! Here is the ice cream I'll make in her honor, defrosting some wild blueberries that were picked last summer at peak season. I'll follow it with a quick recipe for something really simple.
There are many ways to make ice cream - I use a simple refrigerated freezer that gets reasonably cold. Liquid nitrogen techniques are even better if you have access. Drop me a note if you are interested. Also I have more than a few great ice cream recipes from years of experimentation starting with Steve in the dorm basement at Stony Brook. We literally wore out three freezers during our experiments.
Blueberry Ice Cream
° 2 large pasteurized eggs
° 1.25 c white sugar (if the berries are extra sweet, cut the sugar a bit)
° 2 c blueberries - fresh, or flash frozen from last season
° juice of 1/2 lemon
° 2 c heavy cream
° 1 c whole milk
• toss blueberries, lemon juice, 1/2 c sugar in a small bowl, cover and refrigerate for a few hours tossing every half hour or so
• whisk eggs until fluffy, whisk in the remaining sugar a bit at a time and then blend in the cream and milk
• mash the berry mix by hand and mix with dairy mixture
• churn freeze, and season if you can stand the wait - I usually just eat it immediately!
OK now the simple recipe.
The Un-Egg Cream
Too simple to talk about ingredients or technique. I like the idea of an egg cream and this is a simple variation. Get some seltzer or, better yet, Cherry 7-Up chilled to near freezing. Pour some milk-based drink, properly chilled, into a glass - perhaps 50 grams or so (about two ounces). Pour about five or six times as much of the bubbly fluid on top and rapidly stir with a long thin spoon.
Very refreshing. My favorite combination is Sunkist Naturals Pina Colada Protein Smoothie and Cherry 7-Up (diet or regular ... diet is probably somewhat healthier).
1 There is a class of computer "failures" that are, at their hearts, insufficiently understood systems. Often "data" is the issue.