Last night I read two papers that had been recommended earlier in the day. They couldn't have been more different. One was patently wrong and raised questions about science reporting; the other left me with a variety of questions I hadn't considered, keeping me up nearly 'til dawn playing.
First the one that bothers me:
Epidemiological modeling of online social network dynamics
John Cannarella1, Joshua A. Spechler1
1 Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ, USA
The last decade has seen the rise of immense online social networks (OSNs) such as MySpace and Facebook. In this paper we use epidemiological models to explain user adoption and abandonment of OSNs, where adoption is analogous to infection and abandonment is analogous to recovery. We modify the traditional SIR model of disease spread by incorporating infectious recovery dynamics such that contact between a recovered and infected member of the population is required for recovery. The proposed infectious recovery SIR model (irSIR model) is validated using publicly available Google search query data for “MySpace” as a case study of an OSN that has exhibited both adoption and abandonment phases. The irSIR model is then applied to search query data for “Facebook,” which is just beginning to show the onset of an abandonment phase. Extrapolating the best fit model into the future predicts a rapid decline in Facebook activity in the next few years.
A non-peer-reviewed paper that predicts Facebook will pretty much die by 2017 was apparently mentioned everywhere and given enough credibility that six people forwarded the link to me. You can read it for yourself - it isn't very deep or technical. Basically a couple of people from Princeton use a simple epidemiology model, adjust some parameters to fit MySpace's rise and fall, scrape Google for some information on Facebook, and make their prediction.
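To make the mechanics concrete, here is a minimal sketch of the kind of "infectious recovery" SIR (irSIR) dynamics the abstract describes, in which abandonment requires contact with former users. The parameters and initial values below are made up for illustration; this is a toy Euler integration, not the authors' fitting procedure.

```python
# Toy sketch of the "infectious recovery" SIR (irSIR) idea: users adopt by
# contact with current users (infection) and abandon by contact with former
# users (infectious recovery). All numbers here are invented.

def irsir(S, I, R, beta, nu, dt, steps):
    """Euler-integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - nu*I*R/N,
    dR/dt = nu*I*R/N, where S = potential users, I = users, R = ex-users."""
    N = S + I + R  # total population, conserved by construction
    traj = [(S, I, R)]
    for _ in range(steps):
        adopt = beta * S * I / N * dt    # new users recruited by current users
        abandon = nu * I * R / N * dt    # users lost by contact with ex-users
        S, I, R = S - adopt, I + adopt - abandon, R + abandon
        traj.append((S, I, R))
    return traj

traj = irsir(S=900.0, I=99.0, R=1.0, beta=0.6, nu=0.4, dt=0.1, steps=2000)
peak = max(users for _, users, _ in traj)  # user count rises, peaks, then collapses
```

With almost any parameters in this family you get the same qualitative shape - a rise followed by a fall - which is part of why fitting one curve (MySpace's) says so little about another.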
It is wrong in so many ways.1 It is bad enough that I wonder if it was intended to be a bogus piece to test how good online media are at filtering "scientific" results.2 The last point troubles me more than the first, as science ultimately regulates itself. A check of online sites showed many picked it up and reported it as if it were valid. Many people saw it and probably believed it to be solid work based on their trust of whatever website they were reading and the paper's Princeton pedigree.
The second paper is more interesting and useful. Non-scientists often think papers exist to report results. In fact there are a number of reasons to write one; some papers exist to summarize thinking and stimulate others into thought. These are common in new fields as well as fields undergoing enormous change. Workshops are full of talks and papers like this and tend to be exciting places as a result.
Exobiology - life elsewhere in the Universe - is currently red hot with the discovery and early analysis of exoplanets as well as some intriguing possibilities of life on places like some of Jupiter's moons. Until very recently most work has been focused on looking for solar systems with planets in the so-called habitable zone. Our conceptions of life tend to be biased by the sample we're familiar with, so we're looking for Earth-like planets or moons in regions where there would be liquid water - something everyone agrees upon.
The Earth is clearly in a habitable zone; after all, we are an existence proof. But what if we aren't in an optimal habitable zone? What if there are places that are much better? What might that mean for the search for extraterrestrial life?
About a year ago a couple of papers suggested that we are at the edge of such a zone - that if the Sun was a bit brighter, or we were a bit closer, that life would be toast. The paper I read expands on that idea and suggests some characteristics that might make a planet or moon superhabitable.
Superhabitable Worlds Hypothesis Article
René Heller1 and John Armstrong2
To be habitable, a world (planet or moon) does not need to be located in the stellar habitable zone (HZ), and worlds in the HZ are not necessarily habitable. Here, we illustrate how tidal heating can render terrestrial or icy worlds habitable beyond the stellar HZ. Scientists have developed a language that neglects the possible existence of worlds that offer more benign environments to life than Earth does. We call these objects “superhabitable” and discuss in which contexts this term could be used, that is to say, which worlds tend to be more habitable than Earth. In an appendix, we show why the principle of mediocracy cannot be used to logically explain why Earth should be a particularly habitable planet or why other inhabited worlds should be Earth-like.
Superhabitable worlds must be considered for future follow-up observations of signs of extraterrestrial life. Considering a range of physical effects, we conclude that they will tend to be slightly older and more massive than Earth and that their host stars will likely be K dwarfs. This makes Alpha Centauri B, which is a member of the closest stellar system to the Sun and is supposed to host an Earth-mass planet, an ideal target for searches for a superhabitable world. Key Words: Extrasolar terrestrial planets—Extraterrestrial life—Habitability—Planetary environments—Tides. Astrobiology 14, 50–66.
I won't go into detail, but the authors begin by looking at tidal heating as a way to create enough energy to keep water liquid in cold regions. Nothing novel about this, but they go on to talk about some system characteristics. A few are:
° a world larger than Earth, with more surface area for life - but not too large. They argue a mass of two to three times Earth's is optimal.
° lots of shallow water and long coastlines. Single supercontinents and huge oceans are probably not a good thing. (a more massive planet gravitationally favors shallower seas)
° oxygen greater than Earth's 21%, as that places a limit on organism size, but less than 35%, beyond which runaway fires would be too much of a problem.
° a fairly strong magnetic field - stronger than Earth's would be good - to provide shielding from ionizing radiation
° a long-lived, stable star - smaller than our Sun is probably best
° limited plate activity to induce long time scale carbon-silicate cycles
Other ideas that are currently popular are questioned. If you are into this sort of thing I recommend giving it a read.
What is important is that the authors are not reporting a result, but rather a series of questions they've been playing with. It is an open invitation to others to join in and come to conclusions strong enough that eventually testable models can be made for the next generation of space-based telescopes or even probes (to Io, for example).
These papers point out some paths and cautions for organizations using "big data." People doing the work need to be expert enough in the domain they're working in as well as having solid tool competence. They need to be skeptical, and they need the time and resources to play with a number of hypotheses. Too many times I've seen data massaged to fit a corporate direction. Good analysis often leads to new questions - questions that can make a difference. Management needs to be sufficiently versed to appreciate and even question the results.
The first paper shows a failure of questioning and logic as well as a failure of the tech press. The second shows two researchers playing with ideas and encouraging others to join the fun. Guess which one is ultimately more productive?
I'm both excited by and leery of large-scale data analysis. It is damned hard to get right, and there are a lot of tools that bury the details of what's going on under the hood enough to lead to bad results. People are neurally wired for apophenia, and there can be cultural pressures that force it. A strong natural skepticism is essential. An organization requires good people before snazzy tools. Fortunately the bar for getting into the tools is being lowered. R has been terrific, and now advances in some extensions to Python are bringing basic tools to undergrads, and there appears to be a good deal of enthusiasm. But the computational tools are like (name your favorite ball-using sport) balls - you still need a lot of experience before you can play well, and many people will never hack it. It probably makes sense to have at least a few people who think like this for fun - and people from very different fields.
I'll end with a very bad physics joke.
There is a psychology experiment where a person very attractive to the subject waits at a table. The subject is told they will start out 20 feet apart, but five minutes later the distance will be halved, and every five minutes thereafter the remaining distance will again be reduced by half.
The first subject is a mathematician. Upon hearing the plan he gets angry and walks off in disgust muttering "I can't wait forever!"
The next subject is a physicist. She also hears the plan and excitedly agrees. The experimenter says "don't you know you'll never get there?" She replies - "sure, but I'll be close enough..."
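The punchline is just a geometric series: after n halvings the remaining distance is 20/2^n feet, which never reaches zero but gets "close enough" fast. A quick check:

```python
# Remaining distance after each five-minute halving: d_n = 20 / 2**n feet.
# It never reaches zero, but "close enough" arrives quickly.
d = 20.0
steps = 0
while d > 1.0 / 12.0:  # stop once within an inch
    d /= 2.0
    steps += 1
print(steps * 5, "minutes, distance", round(d, 4), "feet")  # 40 minutes, distance 0.0781 feet
```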
Close enough is very important to physicists - especially when they play. There are several tools - Fermi calculations and dimensional analysis are two important examples - that allow you to rapidly sort out if something makes sense.3 If it makes sense, you can think about going in more deeply. But if you are going through hundreds of ideas to get to one worth thinking about more deeply, it is important to be parsimonious with your time. This sort of approach can be extremely useful in making sense of large amounts of data and getting to the right questions.
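To illustrate the Fermi-calculation style, here is the classic "piano tuners in Chicago" estimate. Every number below is a deliberately rough, made-up guess - the point is the order of magnitude, not the answer.

```python
# A classic Fermi estimate: roughly how many piano tuners work in Chicago?
# Every input is an order-of-magnitude guess, which is the whole point.
population = 3e6                  # people in Chicago, roughly
people_per_household = 2          # guess
households_with_piano = 1 / 20    # guess: one household in twenty
tunings_per_piano_per_year = 1    # guess
tunings_per_tuner_per_day = 4     # guess
working_days_per_year = 250

pianos = population / people_per_household * households_with_piano
tunings_needed = pianos * tunings_per_piano_per_year
tuner_capacity = tunings_per_tuner_per_day * working_days_per_year
tuners = tunings_needed / tuner_capacity
print(round(tuners))  # on the order of 100 - enough to sanity-check a claim
```

If a claim implies ten piano tuners, or ten thousand, you know in a minute that something is off - and that's the entire job of the estimate.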
1 It doesn't deserve a detailed takedown, but a few points. The authors posit that the decline of a social network is similar to the spread of an epidemic - i.e., that leaving is symmetric with joining. They don't offer any proof (plus it isn't). They manage to show it works if the parameters are fiddled for one specific case - namely MySpace - and imply that is sufficient. It isn't. Then they claim Facebook is showing decline by pulling their data from a very noisy source - Google Trends. They make some bad ad hoc arguments and produce a plot that shows Facebook has peaked when all other evidence is to the contrary. Then they imply this model just works for social networks.
Seriously. I wouldn't give this a passing grade as an undergrad paper.
2 The last sentence reads: "In addition, the authors acknowledge Professor Craig B. Arnold for fruitful golf discussion."
3 It is curious. I'm a physicist and am often asked to look at technologies or some sort of technique. Much of the domain is practiced by engineers, and often that may be the best approach. But the physics approach is often very different and sometimes offers valuable insight. I'm struck by the difference in approach sometimes. Physicists tend to do a lot of simple play early on.