Recently I read a paper co-authored by a number of social network social scientists on the dynamics of the news cycle, as evidenced by a medium-grained examination of the diffusion and transformation of phrase variants across the web. The authors, Leskovec, Backstrom, and Kleinberg, have a supporting web site. It's worth a look.

Leskovec, Backstrom, and Kleinberg note that the two principal approaches to monitoring the diffusion and transformation of topics on the web (hyperlink tracing and probabilistic term mixtures) capture either short-term spikes of interest in a topic or long-term trends, but fail to capture the intermediate-scale processes that characterize the news cycle.

The authors propose to capture this dynamic by tracking distinctive clusters of similar phrases as they occur in text on the web. To demonstrate this approach, the authors extracted quotations from 90 million articles from 1.6 million web sites from the last few months of the 2008 U.S. presidential election. Their algorithm builds a phrase graph based on phrase similarity and then partitions the graph with a heuristic method, yielding clusters of similar phrases (e.g. PAL AROUND WITH TERRORISTS, PALLING AROUND WITH TERRORISTS). With these clusters in hand, the authors perform a quantitative analysis of the occurrence of different phrase clusters found in the news. Diagramming their occurrence shows the characteristic wax and wane of topics dominating the news cycle.
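To make the idea of a phrase graph concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes word-overlap (Jaccard) similarity with an arbitrary threshold of 0.5, and it swaps their heuristic partitioning for plain connected components. The function names and the second "lipstick" variant are made up for illustration.

```python
from itertools import combinations


def tokens(phrase):
    """Lowercase a phrase and split it into a set of word tokens."""
    return set(phrase.lower().split())


def jaccard(a, b):
    """Word-overlap similarity between two phrases, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cluster_phrases(phrases, threshold=0.5):
    """Group phrases that are linked in the similarity graph.

    An edge joins two phrases whose Jaccard similarity is >= threshold
    (an assumed cutoff). Clusters are the connected components of that
    graph, found with union-find. This stands in for, and is simpler
    than, the heuristic partitioning described in the paper.
    """
    phrases = list(dict.fromkeys(phrases))  # drop duplicates, keep order
    parent = {p: p for p in phrases}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for a, b in combinations(phrases, 2):
        if jaccard(a, b) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for p in phrases:
        clusters.setdefault(find(p), []).append(p)
    return sorted(sorted(c) for c in clusters.values())


if __name__ == "__main__":
    sample = [
        "pal around with terrorists",
        "palling around with terrorists",
        "lipstick on a pig",
        "you can put lipstick on a pig",  # hypothetical variant
    ]
    for cluster in cluster_phrases(sample):
        print(cluster)
```

Counting how often each cluster's members appear per day would then give the kind of time series the authors diagram.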

They find that the global dynamics of the news cycle are explained by (1) imitation or topic diffusion between sources (news sites, blogs, etc.) and (2) a bias against older topics. At a local level the authors discover a 2.5-hour lag between mainstream media and blogs for individual topic threads, with mainstream media's topical phrase volume increasing slowly, peaking, and then dropping off rapidly, while blogs show a rapid increase, peaking slightly later than the mainstream media, and then tailing off slowly.
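The interplay of imitation and a bias against older topics can be sketched as a toy simulation. The version below is an assumption-laden illustration, not the authors' model as specified: it assumes one new thread per time step, a linear imitation term (count + 1), an exponential recency discount, and an arbitrary decay rate. All names and parameter values are hypothetical.

```python
import math
import random


def choice_probabilities(counts, ages, decay=0.5):
    """Probability that a source picks each thread next.

    weight = (count + 1) * exp(-decay * age). Threads with more coverage
    attract imitation; older threads are discounted. Both functional
    forms and the decay value are assumptions for illustration.
    """
    weights = [(c + 1) * math.exp(-decay * a) for c, a in zip(counts, ages)]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(steps, sources_per_step=10, decay=0.5, seed=0):
    """One new thread appears per step; each source then mentions one thread.

    Returns history, where history[t][j] is the number of mentions of
    thread j during step t.
    """
    rng = random.Random(seed)
    counts, births, history = [], [], []
    for t in range(steps):
        counts.append(0)   # a new thread is born with no coverage
        births.append(t)
        ages = [t - b for b in births]
        probs = choice_probabilities(counts, ages, decay)
        picks = rng.choices(range(len(counts)), weights=probs, k=sources_per_step)
        step = [0] * len(counts)
        for j in picks:
            step[j] += 1
            counts[j] += 1
        history.append(step)
    return history


if __name__ == "__main__":
    for t, step in enumerate(simulate(steps=8)):
        print(t, step)
```

With either ingredient removed the picture changes: without the recency discount the earliest threads tend to keep accumulating attention, and without imitation no thread rises much above the others.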


It is interesting to place Leskovec et al.'s work in the context of Sperber's epidemiology of representations. Sperber identifies three types of causal chains particularly relevant to social science: (1) the Cognitive Causal Chain (CCC), in which some semantic relationship is instantiated by each causal link, (2) the Social Cognitive Causal Chain (Social CCC), in which a CCC is distributed across individuals, and (3) the Cultural Cognitive Causal Chain (CCCC), in which "a Social CCC... stabilises mental representations and public productions in a population and its environment" (Sperber 2001).

In this scheme, how does one regard the relatively brief cascades of public productions tracked by Leskovec et al.? The public productions are only the visible parts of social or cultural causal chains. They would seem to occur against a background of low-frequency but presumably stable Cultural CCCs. Given the relative nature of information and meaning, one wonders just what phrase clusters make up that context, and how they contextualize the transitory public representations identified by Leskovec et al. Indeed, to what extent do the transitory loci of interest define a set of sign-posts for the construction of larger cultural narratives? On this view culture is not merely experienced, but generated, reformulated, and used as a resource for situated action. Finally, one hopes that these kinds of research efforts may at some point observe in detail the genesis, spread, and stabilization of long-term elements of cultural life.


Leskovec, Jure, Lars Backstrom, and Jon Kleinberg. 2009. Meme-tracking and the dynamics of the news cycle. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 497-506. Paris, France: ACM. (Accessed October 16, 2009.)

Sperber, Dan. 2001. Conceptual tools for a natural science of society and culture (Radcliffe-Brown Lecture in Social Anthropology 1999). In Proceedings of the British Academy, 111, 297-317.


Comment by Jacob Lee on December 19, 2009 at 9:04pm
I'll just add that computational linguistics, and especially its sub-discipline computational semantics, are rapidly maturing fields that have tackled issues like polysemy in the textual analysis of documents.
Comment by Jacob Lee on December 19, 2009 at 9:40am
Indeed, you make a very good point. Form seems to persist, while meaning shifts. Perhaps this is to be expected, because linguistic utterances are directly accessible to perception while meaning is not. A common code is a basic precondition for communication. Meaning, meanwhile, must be inferred through one's attunement to the context in which utterances are made.

The authors used phrases (various related utterances) as a sort of proxy for topics. For example, 'lipstick on a pig' had a particular topical significance during the 2008 election. But this phrase is not about the election or the circumstances of its utterance during that election. Future utterances of this phrase may or may not correspond to the same topic. Clearly, then, tracking the occurrence of various related utterances as the authors have done may not be sufficient to track topics over the long term. Some kind of tracking of context is necessary, at a minimum. For example, tracking a larger constellation of phrases as they co-occur would seem to go a long way toward fixing this: if the phrases 'Barack Obama', 'Sarah Palin', and 'lipstick on a pig' co-occur in the same part of a document, then I think we can be pretty confident that the topic is an event of the 2008 election. But, as your example of ti and yong shows, co-occurrence is not enough to guarantee stability of associated meanings.
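As a rough illustration of the co-occurrence check suggested here, the following Python sketch tests whether a set of phrases all appear within a fixed number of words of one another. The 50-word window, the tokenization, and the function names are assumptions made for the example; none of this comes from the paper.

```python
import re
from itertools import product


def words(text):
    """Lowercase text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def positions(doc_words, phrase):
    """Word indexes at which the phrase starts in the document."""
    target = words(phrase)
    n = len(target)
    return [i for i in range(len(doc_words) - n + 1)
            if doc_words[i:i + n] == target]


def cooccur_within(text, phrases, window=50):
    """True if every phrase starts within `window` words of the others.

    `window` is an assumed stand-in for "the same part of a document".
    The brute-force search over all combinations of occurrences is fine
    for short texts but would need replacing for large corpora.
    """
    if not phrases:
        return False
    doc_words = words(text)
    hits = [positions(doc_words, p) for p in phrases]
    if any(not h for h in hits):
        return False
    return any(max(combo) - min(combo) <= window for combo in product(*hits))


if __name__ == "__main__":
    text = ("Barack Obama said you can put lipstick on a pig, "
            "and supporters of Sarah Palin objected.")
    topic = ["Barack Obama", "Sarah Palin", "lipstick on a pig"]
    print(cooccur_within(text, topic))
```

A check like this only narrows the context in which a phrase is counted; it says nothing about whether the associated meanings have stayed the same.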

I might add that it may be of interest anyway to track the long-term use of particular phrases despite any changes in meaning.
Comment by John McCreery on December 18, 2009 at 8:19am
Fascinating stuff. One fly in the ointment may be that over sufficiently long terms the significance of phrases may alter due to changes in context. This point is nicely illustrated by Joseph Levenson's discussion of the word-pair ti-yong in his trilogy Confucian China and Its Modern Fate. In the 18th century, ti and yong could be translated as "substance" and "function," conceived as complementary aspects of an intact and fully integrated cosmology. By the early 20th century, that cosmology was collapsing. The same pair of terms had come to imply "Chinese essence" and "Western technology," adoption of the latter being seen as essential for preservation of the former. The terms still appeared together in the same grammatical slots in classical Chinese texts; but their meaning had been transformed.

