Fraud Inc.

CNN is running a page called “Fraud Inc.”, where they sensationalize all of the evil corporate cheats. Funny they mention nothing about the SEC investigation of their parent company (although they have a link to a pre-SEC revelation editorial which is an unabashedly partisan defensive piece claiming that AOL is getting a bad rap from people overreacting). Fortune Magazine is running an article (published 14 days in the future?) puffing up the heroic efforts of the new management, again ignoring the SEC investigation of their parent.

Throw me a bone, people! All I’m asking for is a freekin’ pretense of objectivity here!

Dow 1000

The Dow took 76 years to reach 1,000 (and didn’t stay above 1,000 until 11 years after that). It took another 14 years to hit 2,000 for the first time. And 1995 was the first time that the Dow reached 4,000.

1995 was a good year.

Some Rationale for TIPS (Or, the Coolest Job in Defense Department)

Some while ago, it was reported that the FBI is using data mining to assist in fighting terrorism. The idea is very appealing in theory. Every terrorist incident, in retrospect, has some characteristics that “could have prevented the attack if only someone had connected the dots.” The problem with terrorism, though, is that the “dots” are like grains of sand in the sea. To find the dots in advance and connect them would require such an exhaustive effort that the expense would probably be far greater than the expense of just letting the attack happen.

Data mining provides the start of a solution, though. Simply feed massive amounts of raw data into the computer, and let the computer connect the dots. The computer won’t necessarily be able to prevent an attack, but it can highlight suspicious events or people, and can speed up investigations before and after attacks. Law enforcement has been using data mining in more limited contexts for many years, with great success; for example, to flag suspicious credit card activity.

Unfortunately, the sort of traces that can lead to a terrorist are not as easy to measure and catalogue as are credit card receipts. And data mining algorithms work best on large representative samples of data. This is where I believe that Operation TIPS comes in. The government is already collecting raw information on phone calls, e-mail, hotel registers, and rental car receipts. Other obvious sources could include store purchase receipts, license plate logs from highway control camera systems, etc. As long as this information is used in aggregate, the government’s collection of it is likely to withstand any Fourth Amendment challenge.

By itself, I think that raw data from “citizen informers” is unlikely to be very useful. There is guaranteed to be a large amount of incorrect or even deliberately misleading information submitted. As a standalone initiative, I would guess that the only real benefit of such a hotline would be to make citizens feel empowered and involved, while the tips would continue to be ignored as they always have. But combined with a larger data mining system, I think that “citizen informer” tips would be very useful. In cases where the system detected suspicious patterns in the objective, fact-based data described earlier, it would be smart to cross-check and see if there were any subjective observations submitted through Operation TIPS that could validate the suspicion.

Although the state of the art in data mining is advancing quickly, the task of being able to pick out a few bad apples from a population of billions is obviously not something you solve with “off the shelf” technologies. This is where the DARPA Information Awareness Office comes in. It’s a new office of the Department of Defense, headed by John Poindexter, and chartered with developing the database, indexing, and mining techniques necessary to make sense of these massive amounts of dynamic data.

From my perspective as a citizen, I think that DARPA IAO, Operation TIPS, and things like the X-45 swarms are alarming. Even if they aren’t outlawed by the Constitution, they should be. These technologies are being developed ostensibly to protect the population from being terrorized by an asymmetric threat.

But these technologies in the wrong hands are a means of doing away with “the consent of the governed”. If a government wants to repress particular segments of the population, it has historically needed to use a proportionally large number of police and army to do so. These police and army personnel are themselves citizens, and capable of withdrawing their consent, so the government needs to be careful not to alienate them. And the government must be extremely careful not to allow the repression to be felt in non-target populations, lest those populations join in the revolt and replace the government. On the other hand, the technologies needed to target terrorists with precision are technologies that give their consent freely to whoever uses them, and permit opponents of the government to be picked off with precision that can evade the most vigilant of patriots.

Essentially, rulers rule by consent of the governed only so long as the governed are capable of mounting a resistance that equals the power of the rulers. Simply by way of numbers, “the governed” in modern democracies have always posed a symmetrical threat capability to overthrow the government should it become too corrupt. Paradoxically, though, the tools used by a government to fight the asymmetric threat of terrorism are the same tools that allow a government to gain asymmetrical control over its own people. People in modern democracies have no experience in fighting an asymmetric threat posed by a government. The best strategy, then, would be to prevent such an asymmetric threat from forming in the first place.

(From my perspective as a technical person, though, a job in DARPA IAO is the coolest job in DoD a person could have.)

URI Ethics

The debate over the range of HTTP rages on. The core of the debate is whether or not it is OK for an http: identifier to identify something other than hypermedia, like a car. The only answer is “no”.

URIs are the words of the Internet. Sometimes words are ambiguous, but words are normally expected to mean things. That is what they are for. URIs (and URLs) identify things. That is what they were created for, and what they are used for. It is the only purpose of URIs. When we use words in communication, we always try to use the most appropriate words and avoid words which could easily be interpreted to mean something different than we wish to convey. By the same token, when we attempt to convey meaning on the semantic web, our systems will have to do their best to choose the right URIs.

I find it alarming that people are still arguing in favor of using http: identifiers to identify automobiles. I might write an essay later about the massive logical flaws and wack thinking of the people arguing to abuse URIs, but it’s only depressing that something so fundamental could require explaining. The thing that is alarming is that the stakes are so high. The issue of URIs is foundational to the entire web, and especially the semantic web. It is the issue upon which everything else rests.

Dag Hammarskjold sums it up: “Respect for the word is the first commandment in the discipline by which a man can be educated to maturity – intellectual, emotional, and moral. Respect for the word – to employ it with scrupulous care and an incorruptible heartfelt love of truth – is essential if there is to be any growth in a society or in the human race. To misuse the word is to show contempt for man. It undermines the bridges and poisons the wells. It causes man to regress down the long path of his evolution.”

Hammarskjold in this quote explains that human evolution is critically dependent on our use of words, and that misuse of words harms the human race. He’s not engaged in speculation about whether or not words mean things — it really doesn’t matter if the meanings of words are sometimes ambiguous, nor does it matter that word meanings are a matter of convention. He’s talking about how we use words. His conclusions are self-evident, regardless of whether you prefer Rand or Chomsky. The other thing to notice is that he calls this the first commandment. This also is self-evident. Ability to communicate thoughts reliably to other humans is the most basic prerequisite of forming a civil society.

Everything that Hammarskjold says about words applies doubly to URIs. The Internet is nothing more than a tool through which humans communicate with one another. The radio and telephone have mostly eliminated geography as a significant impediment to human communication, and the WWW is slowly eliminating memory as a significant impediment. But we are still using words to communicate — these revolutions don’t absolve us from our responsibilities to respect the word. In fact, as we gain more power to communicate, abuse of words has power to spread the poison much wider than ever before.

Search engines like Google demonstrate that the next wave of communications evolution is to eliminate “discoverability” as an impediment to human communications. This is the “semantic web” — if 10,000 different people have made public comments about my product, I don’t want to have to visit 10,000 different web sites to find them all. And if I want to make a comment about somebody else’s product, I want to be able to just post the comment to the “cloud” with full confidence that they will be able to get it (if they desire), and without having to worry about whether I am posting to the right web site or not. Unfortunately, it’s not likely that Google is going to be able to collect product reviews about my product unless everyone posting reviews remembers to add some information saying “this is a product review”, and “it is about product X”. Humans attempting to aggregate all of the product reviews by hand would probably have no trouble dealing with ambiguities, misspellings, and so on. But the goal of the semantic web is to eliminate the need for humans to act as middlemen in these sorts of loosely-coupled conversations, and machines aren’t as good as humans at dealing with ambiguity (let alone capricious misuse) in language. Therefore, respect for the word (which is a URI on the Internet) is of paramount importance to the semantic web.
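To make the product-review scenario concrete, here is a minimal sketch in Python of a machine doing the aggregation. Everything in it is an assumption made for illustration: the `urn:example:` identifiers, the site names, and the function `reviews_by_product` are placeholders, not part of any real vocabulary or service. Each statement is a (subject, predicate, object) triple, and the machine matches identifiers exactly, which is why a review that names the product some other way is never found.

```python
from collections import defaultdict

# Hypothetical identifiers, invented for this sketch.
IS_A = "urn:example:terms:type"
ABOUT = "urn:example:terms:about"
REVIEW = "urn:example:terms:ProductReview"
WIDGET = "urn:example:product:widget"

statements = [
    ("http://site-a.example/post/1", IS_A, REVIEW),
    ("http://site-a.example/post/1", ABOUT, WIDGET),
    ("http://site-b.example/post/7", IS_A, REVIEW),
    ("http://site-b.example/post/7", ABOUT, WIDGET),
    # Marked as a review, but the product is named with a different
    # identifier, so exact matching will not connect it to WIDGET.
    ("http://site-c.example/post/3", IS_A, REVIEW),
    ("http://site-c.example/post/3", ABOUT, "urn:example:product:wijit"),
]


def reviews_by_product(triples):
    """Group review documents by the exact identifier of their product."""
    reviews = {s for s, p, o in triples if p == IS_A and o == REVIEW}
    grouped = defaultdict(list)
    for s, p, o in triples:
        if p == ABOUT and s in reviews:
            grouped[o].append(s)
    return dict(grouped)


found = reviews_by_product(statements)
widget_reviews = found.get(WIDGET, [])
```

A human reading the third post would shrug off the misspelled identifier; the machine silently drops it from the widget's reviews.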

The semantic web will not work if people are encouraged to gratuitously misuse URIs. The semantic web will not work if URI meaning is dependent on things like inference based on dereferencing (HTTP GET — this would be the same as if, in conversation, you needed to look up every word in the dictionary, and the dictionary changed by the hour). The potential benefits to human society which could result from eliminating discoverability as a communications impediment are huge. But it is frightening that we have people who can so dazzle and mystify themselves with their own sophistry that they fail to notice themselves trampling on the foundation of something so promising.

Oh my!

This Thomas C Greene guy at The Reg needs some adult supervision. Apparently he heard some signifyin’ from a guy named “Gweeds”, and he’s now an expert on hacker sellouts. Greene is immobilized by slavish admiration in his first article, lapping at the feet of “Gweeds” like a dog. It is really shameful, and embarrassing for The Reg. Some of the choice bits:

  • Hacker conferences exaggerate security threats in order to get more money for sellout hackers. Oh, really? Mr. Greene can keep his head in the sand if he wants, but rational people need only look at the billions of dollars of cost incurred by the last crop of worms — all of which were created by amateur hobbyists. The threat of professional hackers is even greater, and most of the plausible scenarios considered by the defense information warfare agencies are certainly not public knowledge, let alone proffered up at conferences.
  • Hackers are corrupted by greed and abandon their roots in “liberating data”. This is so pathetic. I know far more about hacking and hackers than I know about Java, and I can unequivocally say that hacking has nothing to do with “liberating data”. Silly phrases like “information wants to be free” were fun and rebellious, but nobody is actually motivated by that crap. Hackers are motivated by ego, period. Sometimes for the selfish pleasure of solving a tough technical challenge, but more often externally-directed ego. Exercising your control over the system, proving that you are smarter than everyone else, smiting a competitor, having the entire system trying to stop you but coming through with something to brag about — that is what keeps hackers going. Hackers are incredibly egotistical, and the whole idea of a “Robin Hood” hacker was just a joke hatched up by hackers to get more attention. Nobody who actually has any skills would take such crap seriously.
  • Mudge, Hobbit, and Weld are sellouts. This is one of the strangest parts of the “Gweeds” rant. He makes up an inane definition of what a “true hacker” is, and then proceeds to excoriate anyone more famous than him for “abandoning” the philosophy. But if they never espoused it, how could they abandon it? I have news for Greene – hackers get respect in the community from one thing and one thing only: skills. And that is what is consistent across “sellouts” like Weld Pond, ReDragon, and especially Mudge and Hobbit. They demonstrate skills, technique, and raw mental horsepower that go far beyond the average hacker. As far as I know, none of them ever claimed to be opposed to making money, and I don’t think that any of them would have wasted valuable time arguing about “philosophical purity” when they could better spend the time sharpening skills.

Anyway, Greene got some flames, and got really defensive. He starts out by claiming that he was “just the messenger”, as if that somehow absolves him of stupidity for being so credulous in the first place. He then proceeds to launch into a paranoid rant about somebody’s brother’s cousin and the FBI or something. As if the FBI hasn’t been openly recruiting hackers for decades. Greene claims to have evidence of some grand conspiracy, but strangely doesn’t offer to show us the evidence. He would make even Traficant proud.

This year’s Defcon looks to be the best one yet. The speaker lineup has some extremely technical content, and far more expert speakers than in years past. Unfortunately, I probably won’t be able to go this year. I could go for the Saturday and Sunday sessions (which look like the best ones anyway), but the thought of being in airports three weekends in a row is not too appealing. We’ll see…

XML Diff and Patch

One of the pieces of software I’ve been involved with for the past months is now up as a demo on gotdotnet.com. The primary purpose of the tool is to be able to quickly detect node-level changes between versions of an XML document, and with enough granularity to support efficient patch and merge scenarios. The patch format can be used for fairly terse delta-encoding to transfer incremental changes across the wire as well. The tool is officially called “XmlDiff and Patch” and is implemented in managed code (sorry, no MSXML version). Currently you can only perform XML Diff through the web page, but the assembly should be available on that site very soon. I think it is considerably better than anything else available today, in terms of both performance and accuracy, and the API is a good fit with the System.Xml libraries, so it should be really easy for any VB/J#/C#/F# programmers to use.
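To give a feel for what “node-level changes” means, here is a minimal sketch in Python. It is emphatically not the XmlDiff and Patch API or its algorithm: the function `diff_nodes` and the tuple-based change list are invented for illustration, and children are paired purely by position, which is far cruder than what a real tool would do.

```python
import xml.etree.ElementTree as ET


def diff_nodes(old, new, path=""):
    """Return (operation, path, detail) tuples describing how `new`
    differs from `old`. Children are matched by position only."""
    here = f"{path}/{old.tag}"
    if old.tag != new.tag:
        # Different element names: treat the whole subtree as replaced.
        return [("replace", here, new.tag)]

    ops = []
    if (old.text or "").strip() != (new.text or "").strip():
        ops.append(("text", here, (new.text or "").strip()))
    if old.attrib != new.attrib:
        ops.append(("attrs", here, dict(new.attrib)))

    old_kids, new_kids = list(old), list(new)
    # Recurse into children that exist on both sides.
    for index, (a, b) in enumerate(zip(old_kids, new_kids)):
        ops.extend(diff_nodes(a, b, f"{here}[{index}]"))
    # Children present on only one side are removals or additions.
    for index in range(len(new_kids), len(old_kids)):
        ops.append(("remove", f"{here}[{index}]/{old_kids[index].tag}", None))
    for index in range(len(old_kids), len(new_kids)):
        ops.append(("add", f"{here}[{index}]/{new_kids[index].tag}", None))
    return ops


old_doc = ET.fromstring("<order id='1'><item>tea</item><item>milk</item></order>")
new_doc = ET.fromstring("<order id='2'><item>tea</item><item>sugar</item><note/></order>")
changes = diff_nodes(old_doc, new_doc)
```

The resulting list plays the role of a patch: it records only the changed attribute, the changed text, and the added element, not the unchanged first item, which is why this kind of output suits delta-encoding.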

Now I’m off to New Orleans for a week to investigate the pralines, beignets, muffulettas, crawfish, and other foodstuffs. I’ll be sure to call the TIPS hotline often with my reports. Of course, there is the tiny issue of the Fourth Amendment in the Bill of Rights, which says “the right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized.” But it’s not even one of the top three basic rights, and besides, crawfish aren’t people. They make high pitched squealing noises when being boiled, but they’re not citizens.

Misrepresenting JUnit

Carnage4Life pointed out to me that I was misrepresenting JUnit yesterday. I didn’t realize that you could enumerate method names in Java reflection, but apparently you can, and JUnit uses this functionality at least as well as NUnit. So if you were reading yesterday’s post and thinking “is he smoking crack?”, the answer is “none of your business”, but yes I was wrong about JUnit.
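For anyone as surprised as I was, the trick of finding tests by enumerating method names looks roughly like this. This is a sketch in Python rather than Java, and it is not JUnit’s actual code: `CalculatorTest`, `find_tests`, and `run_tests` are hypothetical names, and the rule “a test is any method whose name starts with test” is the convention as I understand it.

```python
import inspect


class CalculatorTest:
    """Hypothetical test class; only methods named test* count as tests."""

    def test_addition(self):
        if 1 + 1 != 2:
            raise AssertionError("addition is broken")

    def test_broken(self):
        if 2 * 2 != 5:
            raise AssertionError("deliberately failing test")

    def helper(self):
        return "not a test"


def find_tests(cls):
    """Enumerate method names by reflection; keep those starting with 'test'."""
    return sorted(
        name
        for name, _member in inspect.getmembers(cls, inspect.isfunction)
        if name.startswith("test")
    )


def run_tests(cls):
    """Run each discovered test on a fresh instance; map name -> passed."""
    results = {}
    for name in find_tests(cls):
        try:
            getattr(cls(), name)()
            results[name] = True
        except AssertionError:
            results[name] = False
    return results


results = run_tests(CalculatorTest)
```

The point is that the test author never registers anything: adding a new method with the right name is enough for the driver to pick it up.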

Now I will happily offer my personal opinions about the FSF kooks who crashed a recent Commerce Department panel meeting. First, those kooks don’t represent me, nor do they represent a majority of software developers, so I hope that the panel doesn’t mistake them for being a “populist” voice. Second, it is ironic how these people are attracted to the stink of politics and power like moths to the light. As a contrast, Mono today is completely self-hosting on Linux. What ever happened to DotGNU? How about the megalomaniac “Free Encyclopedia” or whatever they called it? Why is nobody using Hurd? I can tell you the answer — if they would sit down and start writing some code instead of running around trying to get close to press and politicians, they might actually get something done. Or maybe they are just afraid to try writing code because they would have to admit they are really not capable. No matter what the answer, it sure is ironic that these guys are so averse to competing through innovation, and instead gravitate toward political activism.

In any case, the sheer hypocrisy is par for the course for any despotic “revolutionaries”. Pol Pot told the Cambodians that he was on the side of the little guy, but instead of doing anything for the little guy, he spent all of his time attacking intellectuals and anyone else he regarded as a threat, and attempting to take over the control mechanisms of government. This rant against the GPL sums it up nicely, quoting Brett Glass: “Meet the new boss — same as the old boss.”

Productivity and Functional Programming

I just started using NUnit (prompted by a recommendation from Tristan at least a year ago). NUnit is a framework for writing and running test cases during development. At Microsoft, like most other places writing software, developers create and run their own “unit tests” (sometimes called DRTs, or “developer-run tests”, internally) at the same time as developing the code. When each dev checks in her code, she runs her own DRTs. Then when the daily builds happen, there is another set of tests which are run to make sure that people didn’t break each other (called BVTs, or “build verification tests”). And of course a whole series of tests beyond that which are done by the testing team. I’m not aware of anyone internally using NUnit for DRTs, but that’s probably because we have a huge infrastructure of tools already in place for testing. I really like NUnit for the sort of things I do, though. NUnit started as a port of the Java-based framework called JUnit, but takes advantage of Reflection and other nice .NET features to make it really simple to integrate testing with the IDE projects. It’s basically just a test driver, though. The code-coverage extensions that were available for JUnit don’t seem to have been ported. Also nice is FxCop.

I have also been playing recently with functional programming. There are a number of functional programming devotees inside of Microsoft, but the buzz is spreading to the pragmatists I think. In Why Functional Programming Matters, John Hughes positioned functional programming as an ideal “glue” for tying software components together. Mondrian was the first programming language I saw which was designed to be a functional “glue” with .NET components in mind. Since .NET components can be written in many different languages, and since CLR expressly tries to eliminate the hard work of cross-language calling, it would seem that the ecosystem of reusable software components will flourish, and programmers can now focus on functional glue rather than tedious implementation details like marshalling and IDL. Of course, no sane development shop would use multiple languages gratuitously in a project, but the point of CLR is to free the shops from having to use the same language that their component library vendors use, etc.

Mondrian is a very simple language, and the one (of the “functional” languages on .NET) I have used most frequently. It is based on Haskell (I think). Mondrian mixes some imperative concepts with functional, similar to F#. F# is basically a port of Caml to .NET. On the other hand, SML.NET was just recently released, and is pure, unpolluted, functional (it implements Standard ML). I feel that SML.NET is more complex than Mondrian, but I am planning to use it for a while and decide then.

All of these implementations of functional programming languages ran into similar issues with the CLR. One of the most important is that functional languages depend on polymorphic functions. Based in large part on feedback from the functional implementations, the CLR team has started implementing “generics” as part of the runtime. Another challenge comes from integrating functional programming with object oriented software components without sacrificing the purity of the functional language. Mondrian and F# decided to go ahead and pollute the language a bit to permit easier reuse of other people’s components. Personally, I think that such hybrid languages will be the only way to be successful in providing functional “glue”, but maybe I’ll change my opinion after playing more with SML.NET. Also, it seems like polymorphic functions are a primary reason that John Hughes was recommending functional programming be used as glue. Therefore, when the CLR supports generics (and a few other cool enhancements that are planned), the normal .NET languages will be able to behave in a more “functional” manner, and I think the appeal of distinct functional languages as glue (the main reason for the buzz right now, IMO) will be less.
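Here is roughly what I mean by polymorphic functions acting as glue, sketched in Python with type variables standing in for generics. None of this comes from Mondrian, F#, SML.NET, or the CLR: `compose`, `map_all`, and the two little “components” are hypothetical names made up for the example.

```python
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def compose(f: Callable[[B], C], g: Callable[[A], B]) -> Callable[[A], C]:
    """Glue two components together: the result applies g, then f."""
    return lambda x: f(g(x))


def map_all(f: Callable[[A], B], items: Iterable[A]) -> List[B]:
    """Apply one component to every item, whatever the item type is."""
    return [f(item) for item in items]


# Two unrelated "components" that know nothing about each other.
def parse_cents(text: str) -> int:
    """Turn a string like '$10.00' into a whole number of cents."""
    dollars, cents = text.strip().lstrip("$").split(".")
    return int(dollars) * 100 + int(cents)


def add_tax(cents: int) -> int:
    """Add an illustrative 8% tax, rounding down to whole cents."""
    return cents + cents * 8 // 100


price_with_tax = compose(add_tax, parse_cents)
totals = map_all(price_with_tax, ["$10.00", " $2.50 "])
```

Neither `compose` nor `map_all` cares what types flow through it, which is exactly the property that makes them reusable as glue, and exactly what generics would let the ordinary .NET languages express.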

Corporate (with a sneer)

Dave is noticing that politicians this year are attempting to tap into the collective discontent left behind by the bubble’s burst. I saw the article he links on the front page of Sunday’s New York Times, running right beside an article about candidates trying to smear one another with insinuations of corporate ties. The NYT article reveals that the largest growth in the work force right now is from people of retirement age. The article tells the story of a sympathetic couple, who sank all of their retirement savings into stocks at the peak of the bubble and are now surprised to see it gone. No matter that people like Warren Buffett and Alan Greenspan (and even me) were warning everyone that the market was irrational — when thousands of baby boomers blow their retirements, personal responsibility is the last thing anyone wants to talk about. And to be honest, I think that the people typified by the NYT article are doing remarkably well at accepting their part of the blame, and it’s only the politicians who want to make it seem like somebody else’s fault. ABC today is reporting that teens can’t find jobs, no doubt a bit surprised to find themselves competing with their grandparents for entry-level jobs. Teens don’t vote, though, so that won’t be a political issue.

The real political issue is about reality being less appealing than illusion. The typical heart-jerking story is about someone who had a bunch of money, and now the money is gone. The part the stories always leave out is that the money never existed. The person just had the promise of money. Even NYT’s sympathetic couple, who had a significant amount of cash at one point and then invested it, were living on an imaginary projected retirement. The only way to claim that the money was real would be to play a “what if” game and say “they would have had the money if they hadn’t invested it…” ABC News continues to propagate this confusion in their article about how the government cooks the books. The article tries to draw a parallel between the corporate scandals and the Bush tax cut by portraying the tax cut as a “hidden debt” that Bush needs to use creative accounting to hide. Tax cuts are great fun, because the politicians always calculate their cost against completely imaginary things like “the surplus” and “social security fund”. The best part about “surplus” and “social security” is that the press never have to say “hypothetical” in front, since everyone knows that they are hypothetical, and only pretends that they don’t when it comes time to campaign.

When the Pringles took their retirement money and decided to invest in stocks, I am certain that they had this discussion. They undoubtedly knew that the stock market could be risky, and knew that there were safer alternatives. They couldn’t have foreseen the magnitude of the impact of lying CFOs, an uninterested SEC, and a sleeping press. But at the end of the day, they made the decision to put their retirement at risk. They can share the blame with the other culpable parties, but so far it seems that most people are rejecting the blanket abdication of responsibility being offered by the politicians.

The pols will keep trying, though. Sunday’s NYT was gratifying, in a way, since it was such strong evidence that I was right when I predicted six months ago:

“This year, the oppressor is not “white anglo male protestants”, “white racism”, “bigotry”, “polluters”, or “big government”. This year, the oppressor is “corporations” and, by association, “the rich”. As the newly conservative establishment focuses on fighting shadow wars with such “enemies” as militant homosexuals, moral-sapping communist professors and the like, they seem to be missing the real offensive being mounted by the underdogs. Two years ago, everyone was rich and powerful, living in a fantasy world called “the largest legal creation of wealth in the history of mankind.” Joe Day Trader wasn’t envious of “the rich”, because he was “the rich”. Now fast-forward to fall of 2001, when everyone is deep in the “down” period that comes after a wild high. Everyone is wanting “just one more fix to make the pain go away, bring back our bubble!” But instead of another fix of market bull, they get an in-your-face demonstration that America is not secure, and then they get laid off from their jobs. This is the shape of mass discontent and disillusionment, and these times are a bonanza for divisive politics. It’s no longer the minority who are disaffected; it’s a freekin’ majority!”

Wealth Bondage has plenty to say about corporate malfeasance. You’ve got to admire the use of rhetoric in this little snip:

“The Free Market? In 20 years it will be 8 guys playing poker for trillions outside the Bank of Bermuda. They will use Fortune 100 Companies for table stakes and Politicians as Waiters. The public, employed mostly as Mules, will watch on TV, identify with the Winners, and thank God that they are Free.”