Flocks, Herds, and Stories

temporal coherence and the long tail

Mark Bernstein

Eastgate Systems, Inc.

© Copyright 2011 Eastgate Systems Inc. All Rights Reserved (but ask!)

I had intended here to offer some
Explanations and apologies
But there's no time. Prepare to shed some tears
And hang on tight. 

“There are many ways in which we could ruin the Web.”

Wendy Hall
(Hypertext 2011)

Last week at Hypertext, Dame Wendy Hall
Reminded us of what we might forget:
The Web is large and new, it flourishes,
It seems to go from strength to strength, and yet
We do not know how strong it really is.
We must remember that we still could wreck the web.

The Death Of Surfing

“Web-surfing is dead. Sure, users may check out a few new sites every now and then...most users will probably spend the majority of their time with a small number of websites that meet their requirements.”

Jakob Nielsen
(Jan 1996)

A cheerful Jakob Nielsen once forecast 
That the web's early froth would soon subside
And leave us with a few large sites that would
Provide the stuff that common readers want,
Leaving the failed and unsuccessful sites
Unfrequented, unvisited, to wither on the vine.

Integrating the long tail

The long tail remains viable if its integral is large.

If the integral becomes small, larger interests and governments will eventually discard it.

The key is that the long tail remain long.

This has not happened yet, and the long tail
Still seems to flourish. Blogging, to be sure
Is not quite what it was, but Twitter is,
And Facebook seems to make a lot for Zynga
And someday might for us. Besides, we still 
Have lots of blogs and lots of other sites,
Folkloric or caloric, scholarly or fun.
They're doing fine for now, it seems, and so
What should we fear? 

As We May Have Thought?

One lesson of the short 20th century: crowds are not always smart.

Lesson of 2011: the net does not route around power.

	I want to argue here
That beyond familiar hazards lies one more:
The tricky danger that mere traffic noise –
Seemingly harmless, familiar, not a threat –
Presents when it confronts the zero lower bound
Without a viable recovery scheme.

Web traffic is noisy

We all know that Web traffic is very noisy, especially for low traffic sites.

Trained to be a chemist, I was taught
To first look for signal and, that found,
To always check the noise. For if no noise appears
It's likely that your signal is unsound.

Finding some noise, you ought give some thought
To measuring its size from trough to spike.
We all know well that server loads will veer
From high to low, and back, from day to day.

It’s always noisy

Traffic fluctuates all the time.

Sometimes we think we know why.

From hour to hour, alike from week to week.
We can explain it, just as the news
Can always tell us what the market thinks:
"Stocks moved down today on fears of fresh
Inflation. Tech stocks gained on un-
Employment news."  But my experience is
There's always something happening, and the noise
Is never really easy to explain.

What we’d expect

We do expect some noise because you can't
Have half a visit. Readers are discrete
Like cars upon the road. At the high-traffic bound
This doesn't matter much. But at the low
Each choice turns out to matter that much more.


Speed: 2 Spacing: 0 Prob.: 0.05
Poisson first studied this, and the key thing
To know is the expected variance 
Is just the mean; so if the mean is N
The variance is N as well, and so:
Look at our logs. We observe -- especially 
In the tail -- a lot more noise than this.
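
The Poisson expectation can be checked with a minimal sketch. It assumes a hypothetical pool of 1000 readers, each of whom independently visits on a given day with probability 0.01; these numbers, and the names `daily_visits` and `dispersion`, are illustrative and do not come from the talk. If readers really ignore one another, the ratio of variance to mean should sit close to 1.

```python
import random
import statistics

def daily_visits(readers, p, rng):
    """Visits on one day when each reader decides independently."""
    return sum(1 for _ in range(readers) if rng.random() < p)

def dispersion(counts):
    """Variance-to-mean ratio; close to 1 for Poisson-like traffic."""
    return statistics.pvariance(counts) / statistics.mean(counts)

rng = random.Random(2011)  # fixed seed so the run is repeatable
counts = [daily_visits(1000, 0.01, rng) for _ in range(1000)]

print("mean visits per day:", statistics.mean(counts))
print("variance / mean:    ", round(dispersion(counts), 2))
```

Running the same `dispersion` function over the daily totals in a real server log is one way to see how far a site departs from this baseline.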

Not Poisson

Speed: 2 Spacing: 10 Prob.: 0.05
Poisson assumes that no one interacts.
But if we interact the noise may change.
If our cars hit the brake in traffic, we
First find that clumps and traffic jams
Increase the noise. 

The noise can go down, too

Speed: 2 Spacing: 10 Prob.: 0.5
The noise can go down, too,
As here, where traffic is so dense 
That by the time they reach our sampling zone
The cars have all assumed a common speed and spacing.

Independent Browsing

Count: 100 Follow: Avoid: 4 Stop: 1000
The same result is found for better models.
Here independent browsers move through space
Indifferent to what other browsers do.  

Flocks Browse Together

Count: 100 Follow: Avoid: 4 Stop: 1000
But here, instead, these readers flock together,
Following their whim unless they see a friend
Nearby, but clinging to their closest friend
If any wanders past.

	We may distinguish here
The HERD, in which a pundit does decide
Where everyone should go, from what I here
Propose, a simple FLOCK, where no one is in charge
Yet nonetheless these organized behaviors do emerge.

Flocks Browse Together

Count: 100 Follow: Avoid: 4 Stop: 1000
Temporal correlation boosts the noise as when
A classroom full of students visits you today
Because a visit to your website was assigned. 
They won't be back today. A year from now
The next year's class may visit you again. 
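
The classroom example can be made concrete with a rough sketch. It compares two hypothetical sites with the same average traffic of about ten visits a day: one visited by 1000 independent readers, the other by 40 groups of 25 who always arrive together. The group size, the probabilities, and the function names are assumptions chosen for illustration, not parameters of the simulations shown in the talk.

```python
import random
import statistics

def dispersion(counts):
    """Variance-to-mean ratio of a list of daily visit counts."""
    return statistics.pvariance(counts) / statistics.mean(counts)

def independent_day(readers, p, rng):
    """Visits on one day when every reader decides alone."""
    return sum(1 for _ in range(readers) if rng.random() < p)

def flock_day(groups, p, group_size, rng):
    """Visits on one day when readers arrive a whole group at a time."""
    arrivals = sum(1 for _ in range(groups) if rng.random() < p)
    return arrivals * group_size

rng = random.Random(2011)  # fixed seed so the run is repeatable
days = 2000
alone = [independent_day(1000, 0.01, rng) for _ in range(days)]
flocked = [flock_day(40, 0.01, 25, rng) for _ in range(days)]

for name, counts in (("independent", alone), ("flocked", flocked)):
    print(name, "mean", round(statistics.mean(counts), 1),
          "variance/mean", round(dispersion(counts), 1))
```

Under these assumptions the mean is the same in both cases, but the flocked site's variance grows roughly in proportion to the group size, which is the boost in noise described above.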

Narrative Drive

Some of you may know that for quite a few years
I've worked as a publisher of hypertext fiction.
We once were the darling of postmodern critics,
And later the –  something  – despised by their rivals.

I mention this story not just for your sympathy
(Though that's always welcome) but rather because
I want to distinguish the high modern fiction
We publish at Eastgate from broader concerns
For narrative that I've expressed in the paper.

Narrative Drive

People like stories, we all want to know
What happens next. We'll tune in tomorrow
To learn how things went, to hear of our friends –
Even our friends whom we don't really know,
Even our friends who don't really exist:
Especially our friends who may not be real.

Fewer Actors: More Noise

We visit tomorrow to see how things went,
Perhaps we might mention the case to some friends,
Or write a short note in our weblog about it.
Either way, herd or flock, stories focus the web.

A Test-Tube Blogosphere

A simple test-tube blogosphere  
Will quickly illustrate
The dangers our sites face when they 
Confront the lower bound.

To start, we have some sites. 
Each has N outbound links
To other sites they like or use 
For regular updates.

Each day, each writer chooses 
A few links to pursue
And Markovly they follow up 
To see what might be new.

Sometimes, a site discovered  
Is added to its list.
And we might sometimes take a look 
At sites that link to us.
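
One possible reading of the test-tube blogosphere is sketched below. It is not the code behind the slides: the parameter names (`hops`, `adopt`, `logs`), the rule that a newly adopted link replaces a random existing one, and the default values are all assumptions made to get a runnable model. Each writer takes a short random walk along outbound links, sometimes adopts a site found along the way, and, with probability `logs`, a visited site notices the visitor in its referrer log and links back.

```python
import random

def inbound_counts(links):
    """Number of inbound links for each site."""
    counts = [0] * len(links)
    for outbound in links:
        for target in outbound:
            counts[target] += 1
    return counts

def simulate(count=50, links_per_site=3, hops=4, logs=0.0,
             adopt=0.1, days=100, seed=2011):
    """Return (sites still linked to after each day, final link lists)."""
    rng = random.Random(seed)
    # Every site starts with random outbound links to other sites.
    links = [rng.sample([t for t in range(count) if t != s], links_per_site)
             for s in range(count)]
    findable = []
    for _day in range(days):
        for writer in range(count):
            here = writer
            for _hop in range(hops):
                here = rng.choice(links[here])  # follow one outbound link
                if here == writer:
                    continue
                # Discovery: the writer may swap a new find into its list.
                if here not in links[writer] and rng.random() < adopt:
                    links[writer][rng.randrange(links_per_site)] = here
                # Referrer log: the visited site may link back to its visitor.
                if writer not in links[here] and rng.random() < logs:
                    links[here][rng.randrange(links_per_site)] = writer
        findable.append(sum(1 for n in inbound_counts(links) if n > 0))
    return findable, links

no_logs, _ = simulate(logs=0.0)
with_logs, _ = simulate(logs=0.05)
print("sites still linked to, no logs:  ", no_logs[-1])
print("sites still linked to, with logs:", with_logs[-1])
```

In this sketch a site with no inbound links can never be reached by a walk, so with `logs=0.0` it stays lost for good. A positive `logs` value gives such a site a way back, because its own writer still browses and shows up in other sites' logs.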

Few Links, no referrer logs

Count: 50 Links per site: 3 Reading list: 4 Logs: 0
When links are sparse and logs ignored, it's true
That nearly all the traffic goes to some successful sites.
The others publish links to what they read 
	– just like the rest;
The sites that still have traffic are in red, the rest are blue.

The zero lower bound

A site that has no links no longer can be found,
And so, quite soon what once was our long tail
Decays to form a grim but stable web
In just the way that Nielsen once foretold.

Few links, referrer logs

Count: 50 Links per site: 3 Reading list: 4 Logs: 0.05
Static, dreary, dull and dead: our tail
Is now no longer long. What can we do
To shake things up? Our bloggers might pursue
Some inbound links discovered in their logs.

Discovery helps

Googling one's self would also do the trick,
Or keyword search, or even buying ads.
The same grim logic holds: our tail again grows short. 
But now a site, though blue, can rise again 
To shine in splendid redness for a time. 

If links are sparse, even the lucky rich
May fall from grace and hear the baying hound,
That grim, unlucky reaper: the zero lower bound.

More links, no referrer logs

Count: 50 Links per site: 10 Reading list: 4 Logs: 0
Add more links, we're better off.
This observation is not new:
It's why we study hypertext.

Mindless link farms don't help much,
And simply linking’s not enough
Since if we hit the zero lower bound
Our site turns blue, and our mood blues too.

More links, no referrer logs 2

Count: 100 Links per site: 20 Reading list: 4 Logs: 0
The hope here is, add links enough
And readers too: you still might lose
The longest part of the long tail...

More Links

Much better!

The zero lower bound still looms.

...But still retain a vibrant "middle class"
Of many sites too busy to endure
That fatal time of loss, but which
Need not consolidate – or anyway
That won't collapse right now.

More links and referrer logs

Count: 50 Links per site: 10 Reading list: 4 Logs: 0.1
As you've foreseen, if we provide more links
And use our logs to rediscover sites
About lost love, the plate that time forgot,
Or synthesis of octatetraenes:
Whatever floats your boat: as you expect
The genre tropes that shape this talk compel
Our problems are resolved – and all is well.


So: we need lots of links, and backlinks too,
if we are not to wreck that fragile Web.
And have we links enough? Is our familiar Web
That final, happy Web that we just saw? 
Or have we launched downhill? How can we tell
What we've already lost?

	My second point: those links
Shape stories, expectations that – when violated 
   as I'm doing right now – drive our readers mad.
Fiction and rhetoric are not artistic toys:
They are the raw material with which Web Science works.


Thank you!