Zhongguo Renmin Car

In less than two weeks, German leader Gerhard Schroeder will be with Chinese leader Zhu Rongji, on board for the maiden voyage of the new Shanghai maglev. People’s Daily is reporting that Chinese business is propping up the German economy right now. From what I saw while visiting Beijing in September, this is probably true.

Having been raised in the home of “the big three”, I pay attention to automobiles. Beijing has a lot of cars; the traffic situation (and I was in the traffic constantly) makes Boston or Manhattan look like child’s play. The first thing you notice about the cars is that Chinese brands dominate all others: names like Songhuajiang, Changhe, Jiefang, Dongfeng, Xiali, and Santana. I asked around, and found that Xiali was a joint venture with Toyota but is now based in Tianjin and majority controlled by China. Santana is the only “Chinese” brand not controlled by China, and is a joint venture with Volkswagen. I would estimate that the brands listed above make up at least 85% of the vehicles on the road at any time.

Even more surprising, most of the remaining cars are not Japanese or Korean. Based on geographical proximity and expertise, one would expect the Japanese and Korean auto makers to be swamping the market. Instead, the clear leader by far for foreign-brand cars in Beijing is Volkswagen, with other European automakers like Volvo and Audi also selling some cars. American, Japanese, and Korean brands combined sell practically nothing.

It became a mission of mine to discover why only Germans seemed to be able to crack the Chinese auto market, but I only found more mysteries. I saw only a few auto showrooms in the city; huge, glitzy, multi-story showrooms with shiny cars positioned at strategic intersections. They were all American brands. Why is it that the best chance of seeing an American car in Beijing is to visit a showroom? If these other brands don’t even have a dealership, where the heck are people buying all of these cars?

So I started asking Beijing residents who owned cars (mostly German cars) to describe their car-buying experiences. I found that car-buying, like everything else in China, involves knowing the right people and a lot of hustle. And car-buying definitely does not involve a visit to the glitzy showroom. I found that the Chinese brands sell their cars from huge lots on the outside of the city. And I found that you can even estimate how much someone paid for their car by looking at where the plates were issued (Guangzhou is a good place to buy cars). But most of all I found that Volkswagen has a ton of credibility for being willing to play by Chinese rules.

This was the story I heard often (in relation to software as often as automotive). Chinese people, I was told, are sick of foreign companies coming in, exploiting the Chinese people, and then leaving with the loot. Other companies just try to siphon off as much as possible while leaving Chinese industry hollow, but Volkswagen is willing to work as equal peers with the local manufacturing outfits. Volkswagen was willing to recycle old models for the Chinese market and sell at Chinese prices, Volkswagen was willing to play the same hustle that Chinese manufacturers play, and so on. In contrast, the people I talked to seemed to think of the Japanese automakers as arrogant and exploitative. Since Beijing seems to love Japanese department stores, and Japanese restaurants were really popular, I doubt that the perception of Japanese auto makers was indicative of any sort of anti-Japan sentiment.

The way I see it, the Germans are just doing a much better job of doing business with China, because they have taken the time to understand the market. (Not that I understand the market very well, mind you, but all of the locals tell me that Germans do, and even I am smart enough to see that the big glitzy American showrooms aren’t working.)

Export OPML from RSS Bandit

Dave Winer is collecting information about importing and exporting feeds from various aggregators. To get a feed list from RSS Bandit, simply select File|Export Feeds as shown below. A file dialog box will appear; just make sure you choose OPML as the file type from the dialog box. There is also a File|Import Feeds menu entry that you can use to import feeds from any of the other aggregators that Dave has listed.
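Once you have the exported file, pulling the feed list back out takes only a few lines. The following is a minimal sketch, assuming the export follows the common OPML convention of `outline` elements carrying an `xmlUrl` attribute and a `title` or `text` label. The sample document and the `list_feeds` helper are made up for illustration; I have not checked them against the exact file RSS Bandit writes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real export may carry more attributes per outline.
SAMPLE_OPML = """<opml version="1.0">
  <head><title>Exported feeds</title></head>
  <body>
    <outline text="Weblogs">
      <outline title="Example Blog" xmlUrl="http://example.com/rss.xml"/>
    </outline>
    <outline text="Example News" xmlUrl="http://example.org/index.rdf"/>
  </body>
</opml>"""


def list_feeds(opml_text):
    """Return (label, feed URL) pairs for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    feeds = []
    # iter() walks the whole tree, so feeds nested inside category
    # outlines are found as well as top-level ones.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            label = outline.get("title") or outline.get("text") or url
            feeds.append((label, url))
    return feeds


for label, url in list_feeds(SAMPLE_OPML):
    print(label, url)
```

Outlines without an `xmlUrl` (the "Weblogs" category above) are treated as folders and skipped.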

Naked XML

If you follow Tim Ewald’s blog, you’ll know that he is religious about XML and run-time typing. If Dare Obasanjo is the Zen priest of XML, Tim Ewald is the Pentecostal evangelist (I am, naturally, an XML Zionist: software is Microsoft’s birthright, and XML her manifest destiny :-)).

Tim is a harbinger, IMO, of where the whole Object-Oriented community is moving. And he is proof that XML is the great peacemaker, stimulating proponents of competing data models and programming models to learn from one another. And the lesson that Object-Oriented (and related) programmers are learning from XML is that semistructured data and programming are beautiful.

Now, professed reverence toward XML is not proof that someone has apprehended the true beauty of the semistructured data model. I have a litmus test of sorts that I use to determine if someone has “got it”. I show them an XPath like “//contact[.//fax]” and watch their faces. Of the people who understand what it does, most will have no reaction, and most of the rest (the experts) will raise their brows skeptically and say “only a stupid person would write such an inefficient query!”. There are yet precious few who exclaim “that is how things should be!” as their faces light up.
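For anyone who wants to see what that XPath actually selects: every `contact` element anywhere in the document that has a `fax` element anywhere beneath it, with no assumption about how deep either one sits. Here is a minimal sketch using only the Python standard library. ElementTree's XPath subset does not handle a `.//fax` predicate as far as I know, so the query is emulated with two tree walks; a full XPath 1.0 engine such as lxml could evaluate the expression directly. The sample document and the `contacts_with_fax` name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical address book: contacts and fax numbers sit at different depths.
SAMPLE = """
<addressbook>
  <contact name="Ann">
    <phone>555-0100</phone>
    <office><fax>555-0101</fax></office>
  </contact>
  <group>
    <contact name="Bob"><email>bob@example.com</email></contact>
    <contact name="Cy"><fax>555-0102</fax></contact>
  </group>
</addressbook>
"""


def contacts_with_fax(xml_text):
    """Emulate //contact[.//fax]: contacts at any depth with a fax at any depth."""
    root = ET.fromstring(xml_text)
    matches = []
    for contact in root.iter("contact"):           # the //contact step
        if next(contact.iter("fax"), None) is not None:   # the [.//fax] predicate
            matches.append(contact)
    return matches


names = [c.get("name") for c in contacts_with_fax(SAMPLE)]
print(names)
```

The skeptics have a point about cost, since each candidate contact triggers another descent into its subtree. The appeal is that the query keeps working when someone moves the fax number under an `office` element or wraps contacts in groups.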

The lesson, of course, is that real-world information is chaotic. In any but the smallest “proof of concept” systems, the best that one can hope for is to be able to recognize small pockets of structure within a sea of otherwise unstructured information. People in the VLDB, data warehousing, and ETL communities have long realized that it is folly to tightly bind data tuples and relationships into restrictive schemas. Boyce-Codd normalization rules maintained flexibility by preserving a distinction between relations and tuples, but even these rules are too binding for many VLDB applications.

But while flexibility is important for complex systems, complete lack of semantics is useless. The real goal is somewhere between strong-typed and untyped — to provide structure when and where you need it, while protecting your right to ignore the rest. The first paper that clarified this idea for me was Peter Buneman’s discussion of dynamic typing for semistructured data.

The last decade has seen a great deal of research into semi-structured data access, some of it quite pragmatic and immediately useful in real-world data management problems (e.g. Florescu). Simultaneously, others researched programming models based on semistructured data (e.g. Meijer). Researchers like Meijer and Florescu (and the rest of the dominant diaspora of researchers originally from University of Bucharest) did not start with XML, to be sure, but they quickly recognized XML as the first semistructured data format with the power to go mainstream.

On the other hand, XML has been pulled in many directions from the start, and has failed to provide a clean and consistent data model. The nattering data model issues have severely slowed adoption of XML for use in semistructured data access and even object serialization. So while XML has become mainstream as an easy-to-parse text format for interop scenarios, the programming and data access models have not really been able to take advantage of this trend in the ways that Tim Ewald envisions.

But, there is reason for optimism. The industry appears to be coalescing around a common understanding of what the XML data model is. My Grandmother will probably never care about this, but to me it makes the future seem rosier.


Blockbuster’s video rentals are way down, and use of high-speed Internet is way up. Do you think the two news items are related?

It’s interesting to see how storage is evolving, though. I find myself wondering why people would pay $1/GB of DVD+RW media and another $400 for the burner when they can just pay $400 for 250GB of 5400rpm external storage. That’s enough storage for nearly 500 full-length movies. Four of those things stack to hold 1TB in the size of a Nintendo Gamecube. Akashic Records, here we come.
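The arithmetic behind that comparison is easy to check. This is a back-of-the-envelope sketch using the prices quoted above; the 0.5GB-per-movie figure is my assumption (roughly what the “nearly 500 movies” number implies), and the function names are made up for illustration.

```python
DISK_PRICE = 400.0         # dollars for one 250GB external drive (from the post)
DISK_SIZE_GB = 250.0
BURNER_PRICE = 400.0       # dollars for the DVD+RW burner (from the post)
MEDIA_PRICE_PER_GB = 1.0   # dollars per GB of DVD+RW media (from the post)
MOVIE_SIZE_GB = 0.5        # ASSUMPTION: a compressed full-length movie


def dvd_cost(gigabytes):
    """Burner plus enough blank media to hold the given amount of data."""
    return BURNER_PRICE + MEDIA_PRICE_PER_GB * gigabytes


def movies_that_fit(gigabytes, movie_gb=MOVIE_SIZE_GB):
    """Whole movies that fit in the given capacity."""
    return int(gigabytes // movie_gb)


print("One drive:", DISK_PRICE, "dollars for", DISK_SIZE_GB, "GB")
print("Same capacity on DVD+RW:", dvd_cost(DISK_SIZE_GB), "dollars")
print("Movies per drive:", movies_that_fit(DISK_SIZE_GB))
print("Movies on a four-drive stack:", movies_that_fit(4 * DISK_SIZE_GB))
```

Change `MOVIE_SIZE_GB` to match whatever encoding you actually use; the movie counts scale inversely with it.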

Too bad all of the video screens and speaker systems on the market today still depend on wires. I would love to just buy a 1TB datacube and stick it in a closet somewhere, and never have to worry about CDs or DVDs again.

Google finds “understooding” less than 50 times, but “amn’t” more than 1500.

Location Services

At least Google is moving closer towards owning the semantic web, and nobody is fussing. They already have a web service interface, and webquotes allows what is essentially annotation metadata about a resource. And assuming that Sergey was not leading Dave on at the conference last week, they are gung-ho about allowing people to update metadata directly into Google. Am I the only person who is grasping the full potential of this?

Here is something to think about: if you could “push” your web pages to Google to be indexed, and Google already caches those pages for access, why would you even have a web server? If you publish to Google’s cloud, you get automatic indexing, metadata like who is linking to you, and more. And Google can add little semantic web-like features such as webquotes every few months to keep you hooked. Then, the advantages of a central index really kick in when metadata starts to explode. Obviously Google isn’t pushing the “we made a better Internet” angle yet, but they could, and the fact that they are so carefully surrounding key strategic bits of territory is not a coincidence. I think AOL and MSFT both blew it already, and the Google guys are not as “aww, shucks, we just like to write web crawler software” as they talk. Game over; the tired old Internet can’t compete.

I wonder why nobody is publicly speculating yet about why Microsoft seems to be so interested in location services?

Kill Kurds, Not Mumia!

I saw Eight Mile a few weekends ago, but I haven’t bothered to put up a review. I thought it was great, but I am pretty biased. The footage was all real, and most of the film was shot in areas that hold lots of good memories for me. I thought it was cool that WJLB’s Bushman even played himself in the movie (scroll to the bottom for the chubby-cheeked guy in Coogi). Only thing that could have made it better for me would have been a cameo by the “Electrifying Mojo” (who was steering his mothership away from WJLB to a competitor during the movie’s timeline).

The movie poked fun at a lot of things. The scene with the protagonist burning down an abandoned house was a good joke. Presumably, he and his friends burned the house down to protect little kids from bad people. Now we know why all of those abandoned houses in Detroit get torched every year: how noble! Same thing with the melodramatic scene where the protagonist discovers that he can use self-deprecating rhymes to avoid getting bullied by a kid from Cranbrook (Scott McNealy’s alma mater, incidentally). Now we know why Eminem talks trash: the poor guy had to suffer through all of that personal agony! And one wonders if Eminem at some point in the past made a bet with friends that “I can get Kim Basinger to say *bleep* *bleep* in a movie”.

The other cute thing was their unorthodox choice of “comic relief” characters. It’s not exactly P.C. to star a Polish kid who is always doing stupid things. Even less P.C. was their portrayal of the stereotypical “funky homosapien” guy with nappy dreads who looks like he has just come from some humanitarian event and won a spoken word contest after an advance screening of “Love Jones” to benefit Mumia’s legal defense fund. Rather than portray the guy like the real B-Boy hero he obviously was, the D12 used him as the butt of their jokes.

The movie closes with the protagonist raising his middle finger at the camera, bored. It sums up the feelings of an entire generation. That generation grew up suffocated by stories of protest and revolution from a generation of baby boomers who took themselves way too seriously. While white kids were forced to listen to stories about how the shop teacher saved the free world by reading Marx and mixing with Abbie Hoffman at anti-Vietnam protests, black kids got constantly held to the example of their uncle Archie who marched against oppression with brotha Malcolm and founded the local Muslim organization (he does good things for his community).

Of course, today’s establishment is yesterday’s anti-establishment; and the media today are dominated by this self-righteous generation who take themselves very seriously and mostly congratulate themselves for whatever it is that they think they accomplished in their counterculture revolutions. The debate between left and right in the media tends to be a bullfight between the thirty-year-old sacred cows of either side, with no room for perspective that violates either group’s remembrance of their glorious youthful past. So I was quite surprised to see the piece in WSJ today titled “Kill Kurds, Not Mumia”.

The story is about a Seattleite who went looking for ruckus last week at the antiwar protests here. He basically grabbed a stick and went looking for eyes, and the result is really funny. The reaction from the veteran protester is the funniest, “I don’t much care for your generation. You’ve got the message all wrong. This is all so stupid.” You can’t find better poetry in a Beckett play. And maybe, just maybe, it shows that the WSJ is starting to get a clue about what it is that attracts Gen Y to people like Tom Green and Eminem.

Unfettered Access for Terrorists?

Brad Wilson is one of the many people “providing comfort” to the enemies of freedom. Just for the record, I believe I was the first to point out the potential problems with open hotspots. My original post from approximately six months ago explains some of the reasons why I think it is a legitimate issue for the feds to be concerned about.

For what it’s worth, I think that “open” hotspots are only the tip of the iceberg here. Even if the feds succeed in eliminating open hotspots, that will keep out normal dabblers only. Any fifteen-year-old hacker worth the name will have passwords to numerous local hotspots, and getting ahold of that information will be exactly as hard for criminals as getting ahold of stolen cellular phone information (in other words, not hard at all). It won’t be long until every sidewalk in America is bleeding wireless access, and not a chance that the industry will have appropriate safeguards (PKI infrastructure based on DNA sample maybe?) in place.

Perhaps the best way to reduce the threat of wireless hotspots would be for the feds to latch onto some of the potential scenarios I predicted in another post. Especially the idea of “ad jammers”. When I mentioned “ad jammers”, I was talking about cheap transmitters that would fill up vacant air and lure people like Doc Searls to attempt connecting.

But if you are an advertiser, isn’t it better to just go directly where the people are, rather than wait for people to come past your transmitter? If you want to go to where the people are, what better place than the nearest public hotspot? If you are lucky, lots of people will already be using it, and the warchalkers will have given you good directions about where to go to put up your own transmitter. The people using this hotspot will likely be cost-conscious and savvy, so they will be eager to hear about your offering. Simply put out a stronger signal with the same SSID, and inject yourself as a proxy with advertisements. Soon everyone will be flooding you with new orders for your product or service. (If not, you should probably be advertising terrorist supplies instead).

All this plan would require from the feds is to make the hardware for doing this inexpensive, and stall like crazy on any legislation that makes it illegal to thus hijack hotspots. I bet that would wipe out the wireless bleed in no time.

(For the humour-impaired, there are at least three sentences of sarcasm in the above post)


During the last 7 days in St. Louis, I carefully avoided anything work-related and instead played around with some ideas I have been having about “semantic web”. I wrote a little “RDF Triple Store” and put hooks into Outlook and Internet Explorer to update the store automatically with various interesting bits of metadata as I do my daily activities. I first wrote the hooks in plain C++, but rewrote them to use C# for everything (except a few lines for the Browser Helper Object shim). I had some trouble moving from C++ to C# for the Outlook events until I discovered that there is a new Primary Interop Assembly available for Outlook. But now that it’s done, I am more sold than ever on C#. The code is so much cleaner, so much more obvious. The data store itself is implemented using ADO.NET and the OleDb providers, although I have no idea why I did that. It’s not as if I really want people to use Jet or Oracle as an underlying store anyway.

Already the store is full of interesting stuff and I have been having fun writing queries against the triples (the store runs on top of SQL Server), and loading them into Prolog to play with the data. One of the next things I intend to do is hook the storage routines into SQL Server’s built-in data mining services, so that basket analysis and decision trees are calculated automatically for interesting information.
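To give a flavor of what querying triples in a relational store looks like, here is a minimal sketch. It is not my actual code: it swaps in Python and an in-memory SQLite database for the C#, ADO.NET, and SQL Server setup described above, and the table layout, the predicate names (`mentionsUrl`, `visitedOn`), and the helper functions are all hypothetical.

```python
import sqlite3


def make_store():
    """Create an in-memory triple table; UNIQUE keeps repeated hooks idempotent."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples ("
        " subject TEXT NOT NULL, predicate TEXT NOT NULL, object TEXT NOT NULL,"
        " UNIQUE (subject, predicate, object))"
    )
    return conn


def add_triple(conn, subject, predicate, obj):
    # INSERT OR IGNORE silently skips a triple that is already stored.
    conn.execute(
        "INSERT OR IGNORE INTO triples VALUES (?, ?, ?)",
        (subject, predicate, obj),
    )


def mailed_links_i_visited(conn):
    """Self-join: URLs mentioned in some mail that also have a visit recorded."""
    rows = conn.execute(
        "SELECT DISTINCT m.subject, m.object"
        " FROM triples AS m"
        " JOIN triples AS v ON v.subject = m.object"
        " WHERE m.predicate = 'mentionsUrl' AND v.predicate = 'visitedOn'"
        " ORDER BY m.subject, m.object"
    )
    return rows.fetchall()


store = make_store()
add_triple(store, "mail:1", "mentionsUrl", "http://example.com/a")
add_triple(store, "mail:2", "mentionsUrl", "http://example.com/b")
add_triple(store, "http://example.com/a", "visitedOn", "2002-12-01")
add_triple(store, "http://example.com/a", "visitedOn", "2002-12-02")
print(mailed_links_i_visited(store))
```

Every question that spans two kinds of facts turns into a self-join on the one table, which is part of why loading the same triples into Prolog is tempting: there the join is just a shared variable.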

Seeing how easy this has been to prototype has made me even more convinced that we are reaching a critical convergence in the industry. Looking at the various projects sprouting all over the web, I think we are already past the point of no return – metadata is going to explode like a nuclear bomb, and it’s too late for anyone to try to stop the chain reaction.

Of course, some projects seem to “get it” more than others. Until just moments ago, I had not paid much attention to the rumors of OSAF using RDF. I always thought that PIM data was a particularly appropriate type of information to store using a KR syntax such as RDF, but I had assumed that OSAF would screw it up. After all, most uses of RDF I have seen to date seem to follow a common pattern:

  1. Project needs to store some data
  2. Project architects want flexibility, and someone looks at RDF
  3. Project architects discover that RDF has some degree of elitist appeal and can attract attention
  4. Project architects forget to consider whether other data models, such as XML, relational, or objects would be more suited to their particular task
  5. Project wastes tons of effort trying to shoehorn their relational or XML data into an RDF-shaped box. Architects ultimately blame the users for producing crappy data that doesn’t fit RDF.
  6. When the project fails they complain that “our project failed because people are ignorant primates who disrespect what they do not understand.”

On the other hand, I think that RDF has (finally) been shaping up to be a good replacement for the scenarios where someone would normally use a format like KIF, and the WebOnt language is concise, readable, and has tons of potential for a broad range of “knowledge interchange” scenarios.

Until a short while ago, however, I felt secure in the assumption that Kapor and crew didn’t “get it”. I was smug until I browsed over to David McCusker’s blog for the first time in a month, and realized that he has recently been hired by OSAF. The gong sounds as Bruce Lee walks into the room…

David’s interest in semantic storage seems to be a full-time obsession, manifested in his coding on IronDoc for the past few years (as long as I have interacted with him). He has been writing and dogfooding flexible “semantic” storage for all of this time (and all the time), despite having a day job, because he can’t stop himself. He “gets” the value of RDF to OSAF, and he has lots of experience and motivation implementing and using the sort of flexible KR-oriented database that is so useful for OSAF apps right now. Now I think that OSAF might after all produce some cool stuff, and I’m actually excited about the possibilities.

That Can’t be Smart

Andrew Orlowski at The Reg is reporting on the latest plan to “resist” the Information Awareness Office. The plan comes from a rather activist-sounding guy at SF Weekly, and basically boils down to “please personally harass John Poindexter, intimidate his wife, and publicize any personal information that might make him more vulnerable to identity fraud”.

I wonder if I am the only one who sees this as being incredibly dumb? Poindexter is a PhD Physicist, former Naval Admiral, friend of spooks (including the current commander-in-chief), and now controls the information being fed to the most powerful force-projection instrument in the history of the earth. And he is obviously a survivor, having resurrected miraculously from the flames of Iran-Contra.

It is one thing to stage an anti-war protest at some college dean’s office in San Francisco, but it is quite another to provoke a fight with the beast that has spawned from DoD’s marriage with domestic intelligence. The “death by a thousand cuts” that Gilmore warns about is already a reality for people who get on the wrong side of the Pentagon, and Poindexter’s giant database has never been a prerequisite. Every fool knows that bad things happen to people who annoy the CIA or FBI, and anyone who claims that they did so and got away with it is probably already owned by the FBI. If it took the public and the heroic SF Weekly nearly twenty years to get proof that Sharpton was an FBI stooge (and then by a deliberate leak from unknown sources), one wonders how the plucky SF Weekly guy plans to emerge unscathed from his battle to “harass the elderly wives of powerful men.”

Clean Underpants v2

I speculated earlier about the algorithms that led to Amazon recommending “Clean Underpants with that Book?”, but now it looks like Mitch Wagner has uncovered the true reason for Amazon’s fashion choices. Apparently Amazon deliberately chose wacky items, purportedly to make it more readily apparent to people that the algorithms were not yet primed with sufficient purchase history to be accurate. This is not a bad idea, and undoubtedly contributed to some extra word-of-blog and word-of-mouth advertising for their new apparel store.