Son of Smart Tags

The debate over Smart Tags has erupted again. Microsoft was forced to pull the feature four years ago, but Google has stolen the idea and is running with it. Dave, Rogers, and others are screaming bloody murder. Four years ago, I argued that the idea was impossible to stop, and that Google was a likely candidate to pick it up.

Essentially, I see this as a free speech issue, and I reject the derivative works arguments. And from my experience from four years ago, I don’t believe that the critics would be satisfied if the implementation were tweaked (if necessary) to pass any legal challenges based on derivative works laws. The critics want to control who talks about their content; or at least how people talk about their content. This is a losing battle.

Imagine for a moment that it is 1983, and you are a busy executive who wants to follow 20 different trade journals but doesn’t have the time. You hire an assistant and ask her to prepare each journal for you before it lands on your desk. Next to each article in the TOC, she is to scrawl a 1-5 rating of how relevant she thinks the article will be. Articles which are deemed useless are ripped out of the magazine. Post-Its are used liberally to “tag” pages with commentary (including scathing comments about particular authors) or cross-references to other magazines. In each journal, she is to underline the first mention of any competitor firm, and insert a printout of the company’s most recent financial statement.

Imagine how productive you would be. Now imagine that one of the magazine publishers attempts to sue your assistant for profiting from a “derivative work”. You would probably pity the publisher. So why should we expect this to be illegal on the WWW in 2005 when it was legal with magazines in 1983?

I honestly understand where the critics are coming from, but as long as the user consciously chooses to have her content prepared and annotated for her, it seems like a big mistake to attempt to prevent or encumber that. Trying to stifle “Renmin Voice” is never a winner.

And I would just remind people that the original Smart Tags architecture was completely open, so every user could choose which tagging authorities to subscribe to and trust. If you think that Google is a step forward in this regard, you don’t know Google. Let’s take bets on how long it will take the industry to dig out of this hole and eventually get back to the same open architecture that was shot down in 2001.

~

Update: I just found some e-mails from December of 2000 where I was encouraging Jeff Reynar (then PM of Smart Tags in Office) to make Smart Tags more RSS and web-friendly. I was surprised to see that he has moved to Google (and was alleged to be behind the feature that looks so similar to Smart Tags). From what I hear, he may have had almost no involvement in the feature at Google. In any case, I wonder why Google didn’t just license the code from Microsoft, and get the same Smart Tags architecture that already works in Office (and worked in the browser until we were forced to yank it)? This would be more consistent for users, and would share the same open architecture. And the code was already written (both very likely use the same BHO hooks). It’s hard to imagine that they didn’t think of this.

Update: Before Evan was acquired by Google, he agreed with Dave about Smart Tags. I wonder how soon before Google starts purging search results for prior art?

Update: Scoble already talked about this, but he’s missing an important point. He says that Google can ship Smart Tags because Google is not a monopoly like Microsoft. By this logic, we would be prevented from shipping Smart Tags for Office, but we would be permitted to ship Smart Tags for MSN Search, where we are the underdog and Google has a monopoly. But interestingly enough, we were able to ship Smart Tags for Office, but not for MSN Search. This is exactly backwards to me. Given our underdog status in the search space, I see no reason not to ship the Smart Tags in MSN now.

Final Update: Dave has now published a more formulated opinion on this. He points out exactly what I said above: there is no excuse now for MSN to not follow suit. As long as the implementations are opt-in, I think any political challenges are DOA. I agree with Dave that this is a bad thing, but for different reasons. I disagree with Dave’s assumption that it would be possible to stop users from adopting this feature. Even if you could shut down Google and MSFT both, the feature would be shipped by someone, and would be adopted, and eventually we would all be forced to follow suit. The real problem is that we let this turn into an arms race, and we’re going to end up with separate walled gardens with incompatible standards. A content provider (such as Amazon) who wishes to annotate users’ pages should be able to implement it once, using open standards, and have it work in Google, MSN, or Yahoo. Users should be able to choose whichever content providers they want, and it should be totally open. If the independent developers had adopted this scenario rather than fighting it, and followed the model of RSS (get broad adoption on open standards before the BigCos have a chance to get locked into competing standards), we might not be in this situation today.

Lisp vs. Smalltalk

Don Box is considering what programming language to teach to best round out his children. When teaching programming as a way to balance a child, rather than to accomplish specific programming tasks, it’s interesting to think about the thought patterns that the language encourages or enforces.

Languages like Basic, C, and Assembler always bothered me, because they reward stubborn dogged determination and muddling along rather than clear thinking. They are very direct and naive: you just keep telling the computer what to do, until it actually works. Lisp and Smalltalk both have purer philosophies. So what tells them apart?

The answer, I think, is that Lisp encourages “right thinking”, while Smalltalk encourages “right classification”. It’s not Prolog, but Lisp forces you to think in structure, in trees, grammar, and in stacks of reference which you can evaluate to be “correct”. Most kids are very weak in grammar wetware at a young age, because they’re exposed to so much crap grammar in the media, and grammar schooling is not like it used to be, so a symbolic language that exercises these parts of the brain is valuable. Smalltalk, on the other hand, encourages people to think in terms of definitions, building structures that would make Linnaeus proud. Of course, both classification and grammar are inextricably intertwined and both have been critical parts of our thinking ever since Plato and Aristotle. But I think kids need to learn “right thinking” first, and it’s important to burn this into the brain while it’s developing. Right thinking is the foundation for much of what we learn later. Good classification philosophy can be learned at any time.

iPod Transitions

Dave just went through the distress of losing his iPod, finding it, and apparently losing it again. I know how it feels. This Friday, I dropped my iPod while it was running, and it does nothing now but click and whir. Luckily all of my music, including several hundred dollars of purchased songs and books, was on the PC. I bought a new iPod yesterday and filled it back up last night. This time I got a mini like my wife has, rather than the 40GB. While it was nice to have my entire music collection on the iPod, I found that I never need more than 1000 songs available at a time anyway, and the mini just transports better.

I noted that both the Bellevue and Seattle Apple stores had only a few minis in colors other than pink. It seems that pink minis are a very popular gift for guys to buy for their girlfriends.

What’s up with sxip?

So all of the smart people say that sxip is the ultimate single sign-on system. The docs look nice enough. But how am I supposed to use it? I can’t find a single “HomeServer” listed on the site. If I have to implement my own server to get logins, it’s not very SSO, is it?

Additionally, I’m having trouble seeing what prevents them from charging me fees after I implement a server, take a dependency on their protocol, and get locked in. I thought it was an open, public domain SSO network — but I can’t find evidence to support this wishful belief.

Ontological Hell

Marc is talking about ways to organize “tags”. Organizing tags in a hierarchy of namespaces (or filtering with manually managed social networks) is just another way of proposing ontological hell. A middle ground between wide open tagging and namespaces would be for tags with ambiguous definition to cite the particular definition in the dictionary to which they refer (“jump; def 2”). WordNet makes this easy; you can reference a common English word generally, or point to a specific “sense” of that word. But even this level of precision is too cumbersome to see widespread use.
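To make the "cite the definition" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Tag` type, the `parse_tag` and `matches` helpers, and the "word; def N" syntax itself are hypothetical, and the sketch does not talk to WordNet at all. It only shows that a tag can carry an optional sense number, and that a tag with no sense can be treated as matching every sense of the same word.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    word: str
    sense: Optional[int]  # None means "the word in general"


def parse_tag(text: str) -> Tag:
    """Parse 'jump' or 'jump; def 2' into a Tag (hypothetical syntax)."""
    word, sep, rest = text.partition(";")
    word = word.strip().lower()
    if not sep:
        return Tag(word, None)
    rest = rest.strip()
    if not rest.startswith("def"):
        raise ValueError(f"expected 'def N' after ';' in {text!r}")
    # int() raises ValueError if the sense number is missing or malformed.
    return Tag(word, int(rest[3:].strip()))


def matches(a: Tag, b: Tag) -> bool:
    """Two tags match if the words agree and the senses do not conflict."""
    if a.word != b.word:
        return False
    return a.sense is None or b.sense is None or a.sense == b.sense
```

Under these assumptions, `matches(parse_tag("jump"), parse_tag("jump; def 2"))` is true, while two tags that cite different senses of "jump" do not match. The extra typing this asks of the person doing the tagging is exactly the cumbersomeness described above.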

I think the real solution to the grouping/hierarchy problems is to have the computer do the clustering and mining automatically. Since Aristotle, every human attempt at creating hierarchies of terms has been nothing more than an arbitrary clustering exercise anyway (things with fur are different than things with scales, etc.) Computers do this much better than people; especially for data which is already native to the computer. Tags should be able to be clustered based on a combination of person using the tag, content being accessed, other uses of that tag, and so on.
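As a rough illustration of letting the computer do the clustering, here is a small Python sketch. It is a toy under stated assumptions, not a description of any real tagging system: it uses only one of the signals mentioned above (the content each tag was applied to), measures overlap with Jaccard similarity, and merges any two tags whose overlap reaches a threshold. The function names, the 0.5 threshold, and the shape of the input are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_tags(tag_items: Dict[str, Set[str]],
                 threshold: float = 0.5) -> List[Set[str]]:
    """Group tags whose sets of tagged items overlap by at least `threshold`.

    Merging is transitive (single-link): if A overlaps B and B overlaps C,
    all three land in one cluster.
    """
    parent = {tag: tag for tag in tag_items}

    def find(tag: str) -> str:
        # Follow parent pointers to the cluster root, shortening the path.
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]
            tag = parent[tag]
        return tag

    for a, b in combinations(tag_items, 2):
        if jaccard(tag_items[a], tag_items[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: Dict[str, Set[str]] = {}
    for tag in tag_items:
        clusters.setdefault(find(tag), set()).add(tag)
    return list(clusters.values())
```

Given tags "python" and "programming" that were mostly applied to the same pages, and "snake" and "reptile" applied to a different set of pages, this groups them into two clusters without anyone declaring a namespace. A fuller version would fold in who applied each tag and how the tag is used elsewhere.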

My RDF Litmus Test

Here is my litmus test of RDF. If someone feels that rdf:type is optional, and at best a hint, then that person “gets it”, and has the same view of the “semantic web’s” vast potential as I do. If that person feels that rdf:type is critical, then they are clearly locked in 1980s thinking and are biased toward a nightmare of ontological mapping problems.

ACLU Stomping Free Speech?

The ACLU is trying to get you to panic about free speech. Attacking the government and corporations is fine, but once every single individual in the country becomes a pantechnicon-toting pal recording everything in their personal experience willing to share observations with anyone else, it’s going to be impossible to prevent the pizza scenario from manifesting.

As it stands, the scenario presented in the ACLU scare video happens already. Someone sees you buying something at the drugstore and mentions it to someone else, who mentions it to someone else. The only remedy is to outlaw gossip, or even outlaw people from communicating anything they see.

Or maybe the ACLU is upset about the proliferation of technology. Gossiping about stuff you saw with your own eyes is OK, but not things that you snapped with your digital camera. So where do you draw the line? Maybe we should outlaw gossip from people who were wearing glasses at the time of the incident in question, since eyeglasses are an artificial technological enhancement. Or maybe the crucial distinction is in the method of communication: voice or writing is OK, but not photographs. Maybe drawings should be illegal if done in anything other than watercolor.

Cheap Metadata

For years, we have heard otherwise intelligent people carp about how “metadata will never work”. But history has marched on, showing example after example of successful applications of metadata. The nabobs have been forced to continually tweak their embarrassing position, now warning us all that “application of metadata to problem X using technique Y will almost certainly fail”.

Dare recently revisited the arguments of one of these inexplicably revered metadata haters. Now Clay revisits the metadata-antagonism of the otherwise venerable Tim Bray. I look forward to the day when everyone thinks as clearly as Dare and Clay, and I’ll be able to wind down my years-long campaign against all haters of metadata, which has sapped me of so much family time.

In any case, I’m not quite as optimistic about tagging as Clay is, and he has a maddening habit of over-classifying and over-categorizing things (“the characteristics of cheap metadata are… take notes, class…”). But he blasts apart the fallacy of “there is no cheap metadata”.

We are swimming in metadata. It’s everywhere. Saying that “there is no cheap metadata” is even more incorrect than saying “there is no cheap opinion”. Opinions come pretty darn cheap, and metadata is just a way of sharing opinions. And with metadata, your car can share an opinion about the road conditions, with no effort at all on your part.

And Clay nails the most important point: when the value of the metadata grows with aggregation (or sharing in general), then the perceived cost of producing it becomes much less important. Considering that we live in a time and place where we can enjoy a cheap meal that includes food from all corners of the globe and involves a supply chain of thousands of human laborers, it’s bewildering that anyone would focus on the expense of the supply chain anyway. Producing and distributing food is infinitely more expensive than producing metadata, and the potential gains to humanity from advances in metadata distribution are comparable to the gains we got by switching from hunter-gatherer to agriculture.

~

Recent history has amply proved that there is a healthy demand curve for good metadata, and there is certainly a huge supply potential. This is how it always has been through human evolution: the supply and demand curves for human communication have always been very strong. The people who focus on supply or demand are ignoring the human condition since Babel. The problem has always been in the distribution network. Sharing your opinions with other humans is similar to loading your back with furs to hike through the snow to the trading post. With modern telecommunications, we are more capable than ever of sharing our assertions with one another, but we are still hopelessly primitive compared to where we could be. When I want someone’s insight on a particular topic, I still have to know who to ask, or how to find out, and my research leads me down many disjoint paths. Until humans can publish their assertions transparently to a single “cloud”, and all humans have access to that cloud, there will still be huge value to be gained from improvements to the distribution network. And that’s not a matter of supply and demand; it’s a matter of time. The supply and demand is there.