Home » 2012 » April

One Query to Inner Join or Left Join Based on a Flag

I ran into a situation today where the issue at hand was to I had to update a query from a Left Join to an Inner Join. One concern was that some of the consumers of the query needed the Left Join while others now needed an Inner Join of the data.

create table One (
    id int,
    noun vchar(5)
)

insert into One
values 
    (1, 'Car'),
    (1, 'Bus'),
    (1, 'Bike'),
    (2, 'Boat'),
    (3, 'Book'),
    (4, 'Sock'),
    (5, 'Rock')

create table Two (
    id int,
    verb vchar(10)
)

insert into Two
values 
    (1, 'Rolls'),
    (2, 'Floats'),
    (3, 'Reads...?')

select o.noun, t.verb
from One o
left join Two t
on o.id == t.id
where len(noun) = 4

The Obvious Solutions

There are a couple of solutions that come to mind first off.

  1. Create two procedures, One with an Inner Join and another with an Left Join.
  2. Modify the procedure to take a flag and switch between the two.

Two Separate Procs

The first solution is probably the worst. You not have nearly the exact same code in two places. Where the odds are that in the any future changes to either procedures will not be propagated between them. Likely causing an bug or unintended side effect.

--Proc One
select o.noun, t.verb
from One o
left join Two t
on o.id == t.id
where len(noun) = 4

--Proc Two
select o.noun, t.verb
from One o
join Two t
on o.id == t.id
where len(noun) = 4

If Not One Then the Other

The second solution is better because it keeps the code in the same logical place. That way if you update one you will likely think: “Hey, maybe I should update that select too”. Still, it is essentially the same code in two different places, be it in the same proc.

declare @flag bit
select @flag = 0

if @flag
begin
    --Proc One
    select o.noun, t.verb
    from One o
    left join Two t
    on o.id == t.id
where len(noun) = 4
end
else
begin
    --Proc Two
    select o.noun, t.verb
    from One o
    join Two t
    on o.id == t.id
where len(noun) = 4
end

We Can Do Better

So the next two things that came to mind were

  1. Take the results from the first select, the one with the Left Join, and using an “if” block select the rows in the result with out the nulls produced by the left join.
  2. Use some sort of logical selection in the query clause or a case statement to auto-magically switch between the two behaivors

Now the first option is not a bad solution. Given you know that the data set will always be small and that the cost of the second query is negligible in CPU and the memory.

Select “if / then” select again

declare @flag bit
select @flag = 0

select o.noun, t.verb
into #tempJoin
from One o
left join Two t
on o.id == t.id
where len(noun) = 4

if @flag
begin
    select *
    from #tempJoin
    where verb is not null
end
else
begin
    select *
    from #tempJoin
end

But…

But…

We Should Do Better

Usually any solution that involves “auto-magically” can be flagged right off as impossible, impractical, or a “bad idea” but something about the solution… if possible… seemed like it could be elegant, concise, and practical.

The answer lies in the select “if / then” select again solution. We have a query that returns the results we need in both cases. In one of those cases we just don’t need all of it and we have a if statement select out the nulls returned by the left join.

That logic of if “flag” then “remove the nulls” just has to be moved into the query. Typically there are three ways to do that.

One is to move the logic into a case statement. We could use the case to omit rows when they don’t match what we want but that would produce nulls. Again, what we don’t want.

The second and third are to embed the logic into the Join or the Where clauses of the statement. For the final solution both work but I chose to use the where clause because I believe it was more readable and simpler for other maintainers to understand.

So when you think about it you want to remove the row or keep the row based on the flag. The first thing you should think is AND. You want to mask the data or keep it so an AND will do the job. The next part of your logic will be to match what you want to filter out. That filter with the AND will turn your Left Join into an Inner Join. With that solved you have to find a way to switch that behavior on and off. The way to do that is to use an OR, not at the same level as your AND but inside with the filter. To ignore it, filter or true, or use it, filter or false. If your having trouble following that read though the code and look though the comments.

Both Left and Inner Join in One Select

select o.noun, t.verb
into #tempJoin
from One o
left join Two t
on o.id == t.id
--Our original where clause
where len(noun) = 4
--AND to filter
and (
    --This is our filter; remove nulls
    t.verb is not null 
    or
    --When this is true the filter is ignored
    --When this is false the filter works
    @flag = 1
)

So that, in a nut shell is it. You can switch between a Left Join and an Inner Join using a flag using only one select statement and no if blocks or case statements. The real advantage is that there is no code duplication across procedures or inside the proc it’s self.

Literal Strings C#

Something that frequently gets overlooked in C# are Literal Strings or Verbatim Strings. These are strings like any other except that none of the usual escape characters work. You may think, “Well that just sounds like a broke string”, but that’s the point.

Some times you don’t want a backslash to escape the characters in your string. For example when writing paths in Windows to a file or folder.

//This causes issues because S is not a valid escape
//and t is valid but you don't want to insert a tab in your string.
s = "C:SomePathThat you wantto use";
//Escaping the slashes hurts readability and is aggravating to update
s = "C:\Imagine\lots\of\sub\directories";

The same is true when trying to write regualar expressions. The symbols to represent digits (d) and and word characters (w) are not normally valid escape characters.

//Using d and w in a regular string will stop your code from building.
r = new RegEx("Jonny (d) is "(w+)"");
//Further more escaping the slashes in a RegEx can make the sting hard 
//to read or even more confusing to people who don't use RegEx-es.

The solution is to decorate the string with an ‘at’ symbol and turn it into a string literal or verbatim string. What that dose is ignore the slashes in your string so that they can be used literally. Ironic, i know…

//All of these strings hurt the eyes less, build and, in all 
//cases except the regular expression, no longer intimidate
s1 = @"C:SomePathThat you wantto use";
s2 = @"C:Imaginelotsofsubdirectories";
//Notice since '' no longer escapes we prefix the " with a "
//That's the only quark, i swear.
r = new RegEx(@"Jonny (d) is ""(w+)""");

Yeah, it’s a simple thing, but lots of people don’t seem to know it. So learn it, use it and make suggest others use it.