Thursday, January 28, 2010

Idiot Lesson On The Types Of SQL JOINs

I'm going to take a departure from my usual topics and talk briefly about a very basic concept - SQL JOINs. I'm doing this mainly for my benefit - I wanted to write this little nugget down so I don't forget it, and what better place to do so than my blog? :)

What's the difference between INNER, LEFT and RIGHT JOINs?

It seems every so often, I forget the answer to that question - I usually get confused ("what happens if there is a null - do I get a row in my results or not?"). Now, I'm sure it is easy for most people to remember how these JOINs work, but I always seem to forget. However, recently I got a good idiot lesson from Joseph Marques on this; he gave me a very short and easy way to remember the rules of these three types of JOINs. It was such an elegant way to explain it that I promised myself I would write it down so I don't forget it, so here it goes (again, I'm sure everyone knows this, and I'm sure I was even told this myself at some point in the past, but of course, I don't remember):

JOINs can be thought of in the context of a Venn diagram. You have two circles A and B. These circles partially overlap. (circles A and B represent your table data - when you join the two circles/tables together, the overlap represents relationships between the data.)
  • The overlap is the INNER join - it contains only the elements in A that also have associated B elements.

  • The A circle is the LEFT join - it contains all A elements regardless of whether they have associated B elements.

  • The B circle is the RIGHT join - it contains all B elements regardless of whether they have associated A elements.
Note: A LEFT or RIGHT JOIN are OUTER JOINs - sometimes you see these specified as "LEFT OUTER JOIN" or "RIGHT OUTER JOIN". Looking at the Venn diagram you can see why - they include the data outside of the inner overlap area (but also including the inner data). If you just see the word "JOIN" without a specifier, it typically is referring to an "INNER JOIN".

Below is an example complete with this Venn diagram visualization:

I have a Person table containing three people: John, Mary and Bob. I have a Phone table containing three phones that have extensions of 987, 555 and 123. John and Bob have been assigned phones (I can call John on extension 987, Bob on extension 123). Mary is not assigned a phone. The phone with the extension 555 is not assigned to any person.

The INNER join:
select name,dial_number
from person
inner join phone on person.phone_id=phone.phone_id

returns results as found in the overlapping area (John:987 and Bob:123).

The LEFT join:
select name,dial_number
from person
left join phone on person.phone_id=phone.phone_id

returns results as found in the entire left circle (John:987, Mary:null, Bob:123). Because the left table is the Person table, I have one row for each person, regardless of whether they have a phone assigned or not.

The RIGHT join:
select name,dial_number
from person
right join phone on person.phone_id=phone.phone_id

returns results as found in the entire right circle (John:987, null:555, Bob:123). Because the right table is the Phone table, I have one row for each phone, regardless of whether a person is assigned a phone or not.

I don't know why it is so hard for me to remember this - it's so obvious when you see it written down like that. But, nevertheless, I find it helpful to refresh my memory with that Venn diagram explanation whenever I experience a brain freeze while developing queries.

Wednesday, January 27, 2010

Java Decompiler For Java5 and 6 - A Jad Replacement!

A few years ago, JAD was the Java Decompiler everyone used. However, it's essentially a dead project now - if you still have it, you'll know it can only decompile Java 1.4 class files.

However, the good news is that a new Java decompiler has come along to take its place - JD. I used this on Windows before, but I'm now on Fedora 11 and I never installed this new Java Decompiler there.

Thanks to Ian's blog, it took me less than 5 minutes to install the JD Java decompiler and its GUI front end (JD-GUI) on my F11 64-bit box. I had to issue this yum command, as per Ian's instructions, to get some libraries installed needed by JD-GUI:

yum install libcanberra-gtk2.i586 PackageKit-gtk-module.i586 gtk2-engines.i586


I then just un-tar'ed the JD-GUI download and it ran fine.

I also took the time to install the JD-Eclipse plugin for my Eclipse 3.5. That worked flawlessly on my first attempt.

So now I have a nice Java decompiler on my F11 box, both as a standalone GUI and as an Eclipse plugin.

Sunday, January 3, 2010

Byteman Plugin Revisited

A couple of months ago, I came across Byteman and liked it so much, I wanted to see how easy it would be to manage it, so I wrote a quick RHQ agent plugin prototype for it and blogged about it.

Since I had some time over Christmas, I decided to revisit that prototype and build it out further. The finished product can probably be considered a good 1.0 version of a fully featured RHQ Byteman plugin.

With this RHQ-Byteman integration, you can now do remote byte-code injection into any Java virtual machine that has a Byteman agent running inside of it. Simply write your Byteman rules, store them in one or more scripts and file-upload those scripts from your browser to the RHQ Server. The RHQ Server will then stream down those script files to the remote Byteman agent, which then injects the code directly in its Java virtual machine. It is extremely easy to run a VM with Byteman inside of it - all you need is to pass a -javaagent VM argument to Java when you start it; something like:

-javaagent:/opt/byteman/lib/byteman.jar=listener:true


The uses for this kind of thing are boundless - from tracing how long it takes for a Java method to execute to testing an application's fault tolerance by forcing test exceptions to be thrown. In the future, I plan on attempting to write an extension to the Byteman plugin to allow me to trace JDBC calls and create reports based on the data I get back. In fact, work on that has already begun; the bulk of the RHQ plugin code is already written, I just have to figure out how to write the proper Byteman rules to get it to do what I want. Once I complete this "JDBC Trace" RHQ agent plugin, I'll blog/demo it.

If you are interested in this new RHQ integration with Byteman, watch the flash demo.

Note that in order to finish this Byteman plugin, I had to add some additional capabilities within Byteman itself. The bulk of the changes manifested themselves as new API methods to the Byteman client - the "Submit" class. These new features allow Byteman to be easily managed, not just by RHQ, but by any management tool that wants to integrate with Byteman. I committed this new code to Byteman's SVN repository which should make these new features available in its next release. Because the new Byteman is not released yet, nor available in the JBoss maven repository, I placed the RHQ Byteman plugin code in its own git branch. Once the next Byteman version is released, I'll merge my code into RHQ's master branch.