Tuesday, September 08, 2009

mvn -Dtest= not working for you?

mvn -Dtest=FooTest test is supposed to only run the test FooTest.java.

I can only get it to work when all the dependencies are in the repo. I wish I knew why, or how to overcome that, but there it is. Took a while to figure out.

Tuesday, June 02, 2009

maven war plugin warSourceIncludes meaning

Arrgg!!! Bloody #@$%^ documentation!

If you use warSourceIncludes without warSourceExcludes, anything not listed in warSourceIncludes will be excluded.

Kind of totally not like how pattern sets work, which the documentation refers you to.

There's also no wiki page where I could have put this.

/rb

Thursday, May 01, 2008

Passing Events instead of Widgets to Listeners in the Google Web Toolkit (GWT)

Level: Advanced
Date: Originally published 10/07

In GWT, a listener object A may attach itself to many other widgets. These widgets will call A's onChange (or similar) method, and pass themselves as onChange's only parameter. Inside onChange, A must then decide which widget called it. At this point a programmer unfamiliar with the code faces a problem: there is no easy way for her to tell which of the widget's many methods and properties are intended to be used by A. The solution given here is to create an interface, named [widget's name]Event, that has only methods to be used inside of onChange. This makes for clean and understandable code.

The full text of this post is available at http://docs.google.com/Doc?id=dw2zgx2_331ns4qh527.

How to Integrate Spring 2.x with the Google Web Toolkit (GWT)

Level: Intermediate

This post explains how to manage your GWT server-side services with Spring and Spring MVC, and to inject Spring beans into them.


The complete post is available at http://docs.google.com/Doc?docid=dw2zgx2_25492p5qxfq&hl=en.

Date: Originally published by Google 10/07, updated 4/08.

Stubbing RPC calls in Google Web Toolkit's Hosted Mode

Level: Intermediate

In this post I describe a way to "fake" or "stub" your RPC calls in GWT's hosted mode. It allows you to deploy to web mode without having to change any code or configuration, and without any stub or development-only code being deployed. It's very clean, and will work with any web application setup (for example, Spring is optional).

The full post is available at http://docs.google.com/Doc?docid=dw2zgx2_106fd4mtt&hl=en.

Date: Originally published by Google 10/07, updated 4/08.

Sunday, April 27, 2008

Dojo modules explained

Level: Intermediate
Date: Originally posted 3/12/08, updated 4/08.

There is a catch-22 in Dojo for what convention to use when naming a module and its classes. Before I can explain it, I need to describe what dojo.require(), dojo.provide(), and dojo.declare() actually do. Then I'll return to the convention problems surrounding what arguments -- namespaces, actually -- to pass them. That way, you'll be able to make an informed decision on how to handle the convention problem yourself.

The full post is available here:

http://docs.google.com/Doc?id=dw2zgx2_328ck7m86dk

dojoAttachEvent: what it's for and how to use it

Level: Intermediate
Date: Originally posted 3/08, updated 4/08.

You've written a widget, and you want other programmers to be able to change how it responds to events without forcing them to edit your existing event code. (In other words, you want your widget's event handling to be extensible without being modifiable.)

The full post is located here:

http://docs.google.com/Doc?docid=dw2zgx2_325c42tg8dx&hl=en

Dojo's cross-browser css solution and how to use it

Level: Introductory
Date: Originally posted 3/08, updated 4/08.

It's old news that browsers interpret CSS differently, and that there is no one best way to deal with this exasperating problem, but Dojo has a clever solution. The following steps explain how to use it, and how it works.

The full post is located here:

http://docs.google.com/Doc?docid=dw2zgx2_327d8q5r7gk&hl=en

Wednesday, March 19, 2008

dojoAttachPoint: what it's for and how to use it

Level: Intermediate
Date: Originally posted 3/08, updated 4/08.

In the JavaScript of a widget, you often might wish to refer to some of its HTML template's DOM nodes directly. For example, in the dijit.layout.AccordionPane widget, the JS might want to access the nodes for the title, the title's text, the container or accordion pane, and so on.

You might think the widget author could just use ids in the HTML template, and then dojo.byId() in the widget's JS. But if she does, then if two or more widget instances are created, they'll all have the same ids! Obviously the JavaScript code will blow up then -- your DOM will have non-unique ids in it.

Instead, you the widget author must do the following:

http://docs.google.com/Doc?docid=dw2zgx2_326c39283gq&hl=en

Saturday, February 24, 2007

The Dirichlet Process

The Problem

Assume you wish to build a phylogenetic tree of S species using the same gene in each species. This would produce phylogenetic tree T1. (Note that with most methods including this one, you may well get more than one equally likely tree.)

If you were to repeat this process with a single second gene, you would get tree T2. For many reasons, including the randomness of molecular variation, the trees T1 and T2 would not be equal.

In short, n genes will produce n different trees.

The problem is: from this data, how can we derive a tree that most accurately represents the actual evolutionary relationships between the S species?

The topology of the tree -- how many branches it has -- shows this information (as in a cladogram?), whereas the length of the branches gives us the additional information of either how far apart in time the branches are, or what the rate of evolution is (if you can pinpoint actual times using e.g. the fossil and geological record). We're only interested in the problem of the topology: how many branches are there, and where do they branch from?

There are two general ways to attempt to derive topology, which sit at opposite ends of a spectrum. A third -- the one we're after -- lies somewhere in the middle. They are:

  1. Assume the genes' evolution is completely independent of each other, at rates that are independent of each other. In that case, for n genes, you are likely to end up with at least n topologies. The assumption is unrealistic. It should be obvious that genes can influence each other's evolution. To use a crude example that puts the point into stark relief: if an allele arises that permits night vision in an environment where a nocturnal niche is available, and so becomes selected for, other alleles somewhat conducive to nocturnal living will become selected for as well.

  2. Assume that all the genes evolve at exactly the same rate. You'll likely end up with just one topology. A straightforward way to derive a single tree, on this assumption, is to concatenate all n genes into one sequence, and pretend that they are all one gene. That way a single tree can easily be calculated. This approach is suspiciously simplistic.

    (There is a separate argument against simple concatenation. It may even be the case that we get a different single tree depending on the order in which the genes are concatenated, so that we don't end up with one tree after all. Potentially up to n factorial trees might result, instead of one. That's a problem with concatenation of the genes, though, not necessarily with assuming all genes evolve at the same rate. Concatenation is just one way to try to capture this assumption mathematically; no doubt there are other, better ways that avoid the above problems.)

  3. Assume that the genes influence each other's evolution, and so their rates of evolution. This assumption lies somewhere in between total independence of the genes' rates of evolution, and the rates being identical. The problem is to find a method of measurement that captures this assumption mathematically, so we can actually measure using it. This is the problem that the Dirichlet method is supposed to solve.

The Solution

When we calculate n trees for n genes, the n trees all approximate some ideal tree which reflects the temporal and genetic relatedness of each species' genome. In other words, there is a "true" phylogenetic tree that we will never actually know, which each gene's tree is an approximation of.

There are different ways to measure how closely related a tree is to another tree. On any reasonable measure, we can expect more genes to "cluster" around the "true" tree than not to. So one approach is to measure how the n trees cluster together, on the assumption that the "true" tree lies somewhere inside the tightest cluster.

We now present a method for clustering the trees which takes into account that their underlying genes do influence each other's evolution. What we don't describe is (1) why this clustering method is realistic or more accurate than, say, throwing trees at a dartboard while drunk and standing on one foot (because we don't understand why), and (2) once the best cluster has been identified, how the best consensus tree is derived from it (because we didn't understand that either).

The Chinese Restaurant Process

Assume a Chinese restaurant with an infinite number of tables and an infinite number of dishes. Whenever a guest comes in, he will either be seated at a table at which other guests are already sitting, or he will be seated at an empty table.

  • If he is to be seated at a table with other guests, then (i) the probability that he ends up at any given table is proportional to the number of guests already at that table; and (ii) whatever dish the other guests are having, he must have too (so every table serves only one dish).
  • If he is to be seated at an empty table, then his dish will be randomly determined by the chef.

If you think of the guests as genes, and of each table's dish as a phylogenetic tree, you can see how this process would lead to clustering. Genes that walk in the door would tend to end up at, or cluster at, the table with the most genes. Unfortunately, of course, the clustering has nothing to do with any measure of how related trees are. It's a mixture of pure chance and whichever table the first gene was seated at: that table is the most likely to end up with the most genes.

I know what you're thinking, but stick with me, my story gets better.

The Polya Urn

Remember that when the guest walks into the restaurant, a decision must be made: either he will be seated at an empty table, or at a table with guests. How is that decision made? With the Polya urn, of course.

Imagine a deck of cards. And an urn. And a mathematician with an irritatingly silly imagination. Forget the mathematician, stick with the cards and the urn. And try to concentrate. Please.

The deck has one unique card for each gene, and a joker. We begin by putting the joker into the urn.

When the first guest G1 walks in, we pull a card out of the urn; in this case, the joker. To pull the joker always means that you get seated at an empty table AND your card and the joker get put back into the urn. Done.

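The second guest G2 walks in. There are two cards in the urn now, G1's and the joker. If G1's card is pulled, G2 is seated at G1's table, *and* both G1's and G2's cards are put into the urn. If the joker is pulled instead, G2 is seated at an empty table, and G2's card and the joker are put into the urn. Either way, the urn now has three cards.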

You should be able to see now how using the urn ensures that, if you are going to be seated at a table with other guests, you are more likely to end up at a table with the most guests.
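The urn procedure described so far can be sketched in a few lines of Python. This is only an illustration of the story as told above, not anyone's published implementation; the names seat_guests, JOKER and table_of are made up for this sketch, and guests are simply numbered 0, 1, 2, ... in order of arrival.

```python
import random

JOKER = "joker"  # hypothetical label for the joker card


def seat_guests(num_guests, rng):
    """Return the table number of each guest, seated via the urn."""
    urn = [JOKER]          # the urn starts out holding only the joker
    table_of = {}          # guest number -> table number
    next_empty_table = 0

    for guest in range(num_guests):
        # Pull a card at random. It always goes back in afterwards,
        # so we just look at it and leave it in the urn.
        card = rng.choice(urn)
        if card == JOKER:
            # Joker: the guest is seated at an empty table.
            table_of[guest] = next_empty_table
            next_empty_table += 1
        else:
            # Another guest's card: sit at that guest's table.
            table_of[guest] = table_of[card]
        # The new guest's own card is added to the urn.
        urn.append(guest)

    return [table_of[g] for g in range(num_guests)]


if __name__ == "__main__":
    print(seat_guests(20, random.Random()))
```

Every seated guest adds one more card for his table, so a table with many guests has many cards in the urn, which is exactly why a newcomer is more likely to land there.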

So how does this make the story better? It doesn't. We make the story better with the joker.

Joker Edge Cases

First, imagine that instead of just one joker in the urn, there are an infinite number of jokers in it. In that case, every time you pull a card from the urn, it is guaranteed to be a joker. That means every guest will end up alone at a table.

Second, imagine the urn with no jokers. In that case, all the guests will end up at the same table. After all, the only way you can end up at a new table is by pulling a joker -- and we've removed it.

What these two edge cases reveal is this. Having no joker is like concatenating all the genes in order to end up with one (simplistic and unrealistic) phylogenetic tree. It's like saying that all the genes are so closely "related" that they are just one gene. Having an infinite number of jokers is like saying the opposite, that no gene influences any other gene, and their trees are no more similar to each other than chance.

The truth, of course, lies somewhere in between, but where? Which (integer or real) number of jokers is correct? Whatever it is, it is a factor, a parameter, a number, and it is called alpha.
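To play with the number of jokers, here is a rough Python sketch in which the joker count is a parameter. It assumes that "alpha jokers" means a newcomer opens a new table with probability alpha / (n + alpha) when n guests are already seated, which is what the urn gives you when it holds one card per seated guest plus alpha jokers. It also assumes that the very first guest always gets a table, even with no jokers, since he has to sit somewhere. The function name crp_table_sizes is invented for this sketch.

```python
import math
import random


def crp_table_sizes(num_guests, alpha, rng):
    """Seat guests with `alpha` jokers in the urn; return guests per table."""
    sizes = []
    for n in range(num_guests):  # n = number of guests already seated
        if n == 0 or math.isinf(alpha):
            # The first guest always opens a table (an assumption for the
            # no-joker case); with infinitely many jokers every guest does.
            new_table = True
        else:
            new_table = rng.random() < alpha / (n + alpha)

        if new_table:
            sizes.append(1)
        else:
            # Pick an occupied table with probability proportional to
            # the number of guests already sitting at it.
            table = rng.choices(range(len(sizes)), weights=sizes)[0]
            sizes[table] += 1
    return sizes


if __name__ == "__main__":
    rng = random.Random()
    for alpha in (0, 0.5, 1, 5, math.inf):
        print(alpha, crp_table_sizes(30, alpha, rng))
```

With alpha set to 0 all the guests end up at one table, and with alpha set to infinity every guest sits alone, matching the two edge cases. Values in between give a handful of tables of uneven size, and the split changes from run to run.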

Alpha

If we can get alpha right, then our genes would cluster around the tree closest to the "true" tree. Or (roughly) equal numbers of them would cluster around those trees closest to the "true" tree.

Remaining Questions


  • Even with the right alpha, repeating this process is surely not always going to yield the same tree: which tree gets the biggest cluster is (I think?) somewhat favored by which table was seated first. So which clustering is the right one?
  • Surely clustering should be a function of some measure of how similar or related trees are to each other. It is not. Chance and alpha alone determine clustering, and not even uniquely (see the previous question). This algorithm can model clustering inasmuch as the entities have some relationship to each other, rather than being independent. But the nature of that similarity (y = x*x, or y = ln(x), or whatever) is not captured.
  • The answer to "How do we calculate alpha?" appears to be "Go to this website, fill out a form, and press Submit," which is not wholly satisfactory.
  • Which part of all this was the Dirichlet process??

/r:b:, 2/11/07

Sunday, June 11, 2006

The Introduction to the Critique of Pure Reason I never had

This essay was written for fellow students to help prepare for an exam on the Kritik der Reinen Vernunft. Many of its insights are Robert Paul Wolff's, but I think the observation about the nature of the words Wissen and Erkenntnis is mine.

The Kritik is structured like a wagon wheel. At its center, the hub around which all else revolves, is the defense of a theory of discursive knowledge (a central term I shall define shortly), and branching out from it are the spokes of the theory's consequences, so large that they dwarf the hub and make us forget that it is at the center. Once the reader has grasped this structure of the book, all the seemingly disconnected themes and strange pronouncements begin to make sense. For what obscures our understanding more than anything is our Humean view of knowledge that makes Kant's discursive view sound like nonsense - until we realize his discursive model is not assumed, but what he has set out to prove.


The Humean view of how knowledge is acquired is roughly encyclopedic: as entries are added to a book, so facts, insights, smells, melodies, and other data are added to that store of the mind which we call knowledge. Thus acquiring knowledge is, apart from still unknown empirical archival processes, a passive activity. Immediately perceived simple ideas or impressions, once stored, are combined into complex ideas and insights by the imagination, reason, and so forth. To end with an example, if you feel a pin-prick, you know and know of that feeling directly.


For Kant, this is at a great remove from knowledge. It is like saying that a heat-seeking rocket "knows" the heat it senses. A seemingly inseparable characteristic of knowledge lacking from the Humean picture is that of consciously knowing something. Kant adds this element by positing knowledge of something not to be simply passive knowledge of that thing, but a realization of what kind of a thing it is. The sharp pain from the pin is not an item of knowledge; it is a piece of sense data, which Kant calls an intuition. Only when it dawns on your mind, "Aha! That sense datum was of the type pin-prick! That was a pin-prick!", has Kantian knowledge come into being and been acquired. For Kant, to know a thing is only to know what kind of a thing it is. Thus all knowledge takes the form of a judgment, specifically of matching a particular intuition (or concept) with the concept it falls under. To know that Secretariat is a horse is to have knowledge, but neither the empirical experience of Secretariat nor the empirical concept of a horse alone constitutes knowledge: for knowledge, the two must be combined, and Kant argues that they are combined by making a judgment. For Hume, the sense datum alone would have constituted something known, rather than merely felt, and so an item of knowledge. The alternative theory of knowledge defended by Kant is called discursive, and from now on we must realize that when Kant speaks of knowledge, he is speaking of discursive knowledge. Humean knowledge is passive; Kantian knowledge is active.


This difference between two rival views of knowledge stands out starkly in Kant's German, but not in English. If you look up "knowledge" in an English-German dictionary, you will find "Wissen," an exact translation of the Humean view. Yet the word "Wissen" hardly ever occurs in Kant's huge treatise on knowledge: the index has only two references! The word Kant uses for knowledge is "Erkenntnis," which my dictionary doesn't even list under "knowledge"! For an Erkenntnis is not something you know or passively possess, like "Wissen," but something you actively acquire. It is a literary insight, a scientific discovery, a sudden illumination, an "aha" experience. The English word "insight" comes much closer to Erkenntnis than "knowledge."


The German verb corresponding to Wissen, wissen, is a highly passive verb, while that corresponding to Erkenntnis, erkennen, is active. To "wissen" something connotes as much activity as the English (and the German, for that matter) "to have" something. To "wissen" something is to have it, to be sitting on it, without anything being conveyed about how it was obtained. To "erkennen" is completely different. To say that one has "erkannt" something is to convey not only that one now possesses it somehow, but that the act of obtaining or coming to know it was a realization, an insight, a vivid experience. "Wissen" is couch-potato knowledge, "erkennen" is both knowledge and the active process of discovery of knowledge.


So in German we can see immediately that Kant has a completely different view from our 20th century American view of what it is to know, and until we realize this, we will be adding yet another layer of incomprehensibility to the already thick one of his sibylline prose.


If we now step back and ask ourselves how anyone would naturally proceed to prove that a discursive theory of knowledge is correct, we soon find that Kant has indeed taken predictable steps, and that the overall organization of the Kritik can in part be easily picked out. Thus short reflection reveals that our project should consist of at least three parts. First, since this is a philosophical theory, we cannot prove it by investigating the physiology or psychology of the mind. We must instead seek to show that various true results follow from assuming the theory to be true. These various consequences are the great spokes of the wagon wheel that is the Kritik, such as the refutation of idealism, the analysis of the arguments for the existence of God, and many other large themes that at first sight have no connection with a mere theory of knowledge. In addition to the consequences, however, and in order to derive them later, we will first have to spell out and develop our discursive theory in more detail than just "all knowledge is a judgment." And if we reflect for a moment on how to proceed with this task, we quickly notice that since the knowledge-engendering judgment is divided into a sensory intuition and forming an intellectual concept, we might do well to develop our theory by elaborating its sensory and then its intellectual forks, which when united in a judgment produce knowledge. And this is in fact exactly what Kant himself does. The Transcendental Aesthetic develops the intuitive component, and the various deductions develop both the intellectual component and the details of how the two are fused, in that order.


The Transcendental Aesthetic: The Intuitive Fork


When Hume set out to investigate immediate sense impressions (which he called impressions or ideas and Kant called intuitions), he introspected diligently and had much to say about how they combined and associated with each other to form more complex, secondary ideas.


Not so Kant. If you think about it, the basic split between idealism and realism occurs right after this point of perception. The realists go on to argue that there really is something out there triggering our feelings, and the idealists go on to argue that there is nothing, or at least that we can always doubt there to be something out there, and that the world is simply our impressions and ideas - hence they are called idealists. Kant, in yet another sweeping spoke, found both these positions to be fundamentally flawed, beginning right at this first step of how they viewed immediate perception. Kant's move is to seize on the fact that all these philosophies assume the "spatialness" and temporality of objects, and to propose, as a realist, that although there are really objects "out there," spatiality and temporality are not intrinsic properties of these entities. Instead these properties are merely how humans can become aware of them at all. As Kant puts it, space and time are conditions of experience. In other words, a crucial mistake of idealists and realists is to argue over whether there are spatio-temporal objects, without realizing that spatio-temporality may be in doubt without the existence of external objects being in doubt. An example should make this clearer.


Imagine a log of driftwood bobbing in the ocean. Nearby is a primitive sea-creature whose only sense is that of smell. It knows many different kinds of smells, in some sense or other of "know" that allows it to recall and re-identify smells. Thus everything in the world for it is a smell, as is the driftwood with its distinctive smell. Kant might say that the form of this creature's experiences is odorific (especially since he loved to invent contorted labels). It is also quite likely that the creature does not "know" about space, about near or far, up or down; if things exist or are, for it they just smell, but they are not anywhere or perhaps even at any time. It perceives only smells, and conceives of things that way as well (as in the concept of the driftwood smell). Mutatis mutandis, we can imagine a near-by creature that can only hear. A person, however, has, amongst other things, a spatial and temporal notion of the driftwood, and indeed of all objects external to him or herself. In this sense, then, space and time are conditions of a person's experience - just as smell or hearing are conditions of the above sea creatures' experience. The way we experience material objects is such that we conceive of them as being somewhere in space and time. There may well be other ways of conceiving of objects that are impossible for humans. No one conception (or perception) is the correct one: space and time are just possible ways of conceiving of objects of experience, and there may be others. Thus we - and Kant - can speak of the object in itself as that object which can possibly be conceived of in a number of different ways, of which no one particular way is the real or correct conception of the object.


Thus just as the only way to see and so to perceive H.G. Wells' invisible man would be to throw paint on him, so we must throw the paint of space and time on material objects; other creatures have perhaps only the paint of smell, or sound, or tactility, or something we don't know of. In short, idealism and realism take the properties of space and time for granted, and Kant believes that by not doing so, he can resolve the difficulties of these philosophies.


Synthetic and Analytic


We are almost done now with the intuitive fork; if anything, we have said too much. There is only one more, notorious issue to point out. We can see now why it makes sense to speak of space and time as forms of intuition or experience: everything we experience is either in time or in both space and time, and never outside of these, it is always in temporal or spatial and temporal form. But where do these forms come from? We cannot have learned them inductively from experience, at however young an age, for experience itself is impossible without them!! Every thing we intuit is a temporal thing, or a spatiotemporal thing, so we may tentatively call space and time concepts.[1] Since they are prior to experience, they are a priori. Since they therefore cannot be derived from any object of experience, but add something intrinsic and inseparable to it, they are synthetic, rather than analytic, as well as a priori concepts. Swathes of the Kritik are devoted to showing that such entities can exist, and indeed, Kant cannot avoid this issue. For in the intellectual fork he will again have to posit such things.


The Metaphysical and Transcendental Deduction: The Intellectual Fork


Now if indeed we do gain knowledge through acts of judgment which are conceptualizations ("This intuition is that kind of a thing"), an idea suggests itself: there may be some basic forms of judgment. After all, in formal logic, the statement "p is C" has a certain number of different yet basic forms: p is not C, all p are C, all p are not C, some p are C, possibly p is C, and so on. Here, p is the grammatical subject, and C is a predicate of p. These rules are general because they tell us nothing about conditions for a C to be correctly applied to a p. In other words, this set of forms of judgment tells us nothing about how to determine the truth-value of any one form. All they tell us is how to combine various forms IF they are true, or IF they are false. Discursive knowledge judgments, on the other hand, do have truth-values. Thus, of an intuited event, we could correctly or incorrectly judge that it was a causal event, i.e. a causal type of event.


What Kant believes is that some forms of discursive judgments are basic because they contain basic concepts, one of which he thinks is, say, the concept of a causal event. These basic forms he terms categories. Mysteriously, he thinks that the category of causality can be inferred from the general form of the hypothetical relation between judgments: "if (p is C) then (q is D)." Similarly that of existence from the assertoric modal form; and so on for all the other twelve forms of general logic he identifies.


It matters little whether he can or not. First, he has a quite separate derivation of his treasured categories later, in the Analytic. Secondly, important arguments later are merely for the possible existence of categories in general, and thus independent of whether the ones he has derived are the true ones.


Well, if nothing else, the above Metaphysical Deduction gives us the idea that perhaps there are basic concepts that underlie all others, and which subsume an intuition in a discursive judgment. It also suggests what these basic concepts or categories might look like. We are now ready for the final push, the unavoidable core of Kant's discursive theory of knowledge: how concepts must fuse with intuitions, and the nature of the resulting knowledge. In short, the nature of judging, and the nature of judgments. Until he has spelled out his theory in detail, he cannot move on to derive consequences that may show it to be true. This description of the act of coming to know lies in the Transcendental Deduction. Once again, we can best follow it by first musing about how we ourselves might undertake it.


Discursive knowledge of a single intuition really presents no difficulties. It is brought under a concept, i.e. what kind of intuition it is is recognized, and that's that: it is known. But what of a multitude of intuitions? Perceiving a particular horse will involve varied immediate sights, smells, sounds, and so on. How to bring so many intuitions under one concept of "horse"? Kant's reasonable if ultimately empirical guess is that this cannot happen. Instead, each individual intuition must be conceptualized, producing judgments that he in the Transcendental Deduction calls representations. Then the representations are somehow spontaneously fused into one entity in an act of synthesis, which can then be brought under a concept. Thus is a horse known.


But all this is conjecture - any science fiction writer can produce similar or disparate models of knowing. Kant moves it towards a proof, with an argument that need not convince us but that we should at least understand. Look, he says, it must be logically possible to preface every representation with "I think," for if it were not true to state that I thought my own representation, how could it be mine? (Here, "to think" means nothing other than bringing under a concept.) So much logic demands. Furthermore, the only way I could ever become self-conscious, that is, aware of myself as one more entity in the world, would be to notice that all these "I think"s refer to the same I. Yet the only time I can have more than one "I think" present to mind is during synthesis. Hence the fact that I am self-conscious is sufficient evidence that synthesis does occur in the mind, or at least that it occurred once, namely when I achieved self-consciousness.


But there must be more to be said, for while this describes discursive knowledge, such knowledge has not been fully described without mention of how it relates to what it is knowledge of! The synthesized manifold must refer to a determinate object, like the horse Secretariat, and the concept synthesizing the manifold - for a concept is what must synthesize it - will be the concept of Secretariat. Since a manifold can only be unified or fused inside one mind or consciousness, this unity of consciousness is a necessary condition of synthesis, and of the synthesized manifold having (i.e. referring to) an object. For Kant, it is arguably even a sufficient condition.[2]


In sum, that I have only one mind and that I am self-conscious is argued to be a necessary and sufficient condition for discursive knowledge. It is almost left as obvious that the concepts that carry out synthesis of the manifold are the categories, and so are also conditions of knowledge.


I now leave it to the reader to show how the Schematism and Analogies contain both the central theme or "hub" of discursive knowledge, and the supporting spokes (like the refutation of idealism). Good luck to us all on the exam!


/r:b:





[1] Over Kant's protests in the Aesthetic, which he later violates himself. Just for the record, his argument there is essentially that since he can allegedly show that space is not a concept, and knowledge of the world can only consist of concepts and intuitions, it must be an intuition. Ditto for time.


[2] Kant's Transcendental Idealism, Henry Allison, pp. 147-8.