Jury Nullification

May 18, 2012 2:36 pm

Along with Bayesian reasoning, I think jury nullification is a topic that is important, yet widely unknown.  The topic is somewhat controversial, but I’ll present my viewpoint and explain why I think it matters.  Before I start, I will note that I’m not a lawyer, and I’m limiting my discussion to criminal trials because the laws surrounding civil trials are different.

Borrowing from Wikipedia: “Jury nullification is a constitutional doctrine which allows juries to acquit criminal defendants who are technically guilty, but who do not deserve punishment. It occurs in a trial when a jury reaches a verdict contrary to the judge’s instructions as to the law.”  That is, a jury, though believing the defendant is guilty under the law, refuses to return a guilty verdict because they disagree with the law.

It can certainly be abused to let criminals who should be convicted go free, but far more importantly, it can serve as another check and balance to protect the people from unjust laws.

What’s particularly interesting is that jury nullification isn’t something that can be outlawed without inherently jeopardizing the jury system at large.  What I mean is that jury nullification isn’t something that was specifically created; it’s a necessary side effect of having juries that can return verdicts without coercion.

In the United States, it is illegal to punish a jury member for the returned verdict.  I would hope that it is obvious why this law is necessary: no jury could return an impartial verdict if its members feared retribution.  Of course, there are laws against jury tampering as well, but the difference is that a juror could fear punishment even without an explicit prior threat.  For instance, without this law one could be fired after serving on a jury because one’s employer didn’t agree with the verdict.  For juries to even pretend to be impartial, they can’t be worrying about such consequences while deliberating.

In the United States, the other law necessary to create the opportunity for jury nullification is codified in the Fifth Amendment: “[N]or shall any person be subject for the same offence to be twice put in jeopardy of life or limb…”  Once a jury has returned a legitimate verdict, the defendant can’t be retried for the same crime.  Without this protection a judge could simply order new trials until the desired verdict was returned, effectively stripping the jury of any power.

With these two laws, jury nullification comes into existence as a possible avenue of justice and injustice.

Personally, I consider the benefits of jury nullification to outweigh the risks.  But that may be due to my predilection for the idea that it is better to let 10 guilty persons go free than to convict 1 innocent person.  I realize that many people feel instead that it is better to convict 10 innocent persons than to allow 1 guilty person to go free, but I think that’s rather sad (and perhaps a reason the U.S. has more prisoners than any other country in the world, whether you measure per capita or in absolute numbers).

The risk of allowing jury nullification is that like-minded individuals can allow criminal behavior by refusing to find defendants guilty when tried.  For example, an all-white jury of racists may refuse to convict a white person of murdering a black person.  This is clearly a bad thing and a failure of the justice system.  Yet this is entirely possible and legal because of the concept of jury nullification.

The benefit of allowing jury nullification is that like-minded individuals can protect themselves from an oppressive government.  For example, were the government to enact a law making it illegal to refuse a body search at airport security, a jury could simply refuse to convict anyone tried under that law.  (What I find interesting is that this currently is law, yet the people who have broken it have never gone to trial.  I think the prosecutors are afraid jurors might refuse to convict, and then airport security would look even stupider than it already does.  The most they’ve ever done is threaten a civil suit, which doesn’t “suffer” from jury nullification because in civil suits the judge can override the jury.)

Of course, jury nullification only works if citizens are actually given jury trials for their crimes.  So when the government reclassifies you from citizen to terrorist and hands you over to the military for indefinite detention without a trial you’re kind of out of luck.  I see this as a flaw rather than a feature.

Bayes’ Theorem

May 12, 2012 3:38 pm

By request, here’s my attempt to explain Bayes’ theorem (cribbing heavily from Wikipedia).

Deriving Bayes’ Theorem

We start with the standard definition of conditional probability (for events A and B):

P(A|B) = P(A ∩ B) / P(B)

Which reads: The probability of event A given the known occurrence of event B is equal to the joint probability of events A and B (i.e., the probability of both events occurring) divided by the probability of event B (assuming the probability of B is not zero).  I’m not going to show the derivation of conditional probability.

The multiplication rule (just the definition above, rearranged) helps us understand the joint probability of A and B:

P(A ∩ B) = P(A|B) P(B)

It tells us that the joint probability of A and B is equal to the probability of A given B multiplied by the probability of B.  All we’re talking about is the likelihood of events A and B both occurring.

Keep the following equivalence in mind; we’ll need it in a minute.  It simply says that the order of variables when writing the joint probability is irrelevant.  It should be fairly straightforward that the likelihood of events A and B both occurring is the same as the likelihood of events B and A both occurring.

P(A ∩ B) = P(B ∩ A)

Filling back in our definition of conditional probability, we have (with the understanding that P(B) is not 0):

P(A|B) = P(A ∩ B) / P(B)

P(A|B) = P(B ∩ A) / P(B)

P(A|B) = P(B|A) P(A) / P(B)

The third equation is the simplest form of Bayes’ theorem.  It wasn’t very hard to get to and the math, relatively speaking, is quite simple (we’re not talking about deriving the Schrödinger equation or anything absurd).  But its application, and understanding what it means, can be tricky.

P(A) is called the “prior,” representing our prior belief in the occurrence of event A.

P(A|B) is called the “posterior,” representing our belief in the occurrence of event A after (post) accounting for event B.

The remaining pieces, P(B|A) / P(B), are called the “support,” representing the support event B provides for the likelihood of event A occurring.
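In code, the theorem is a one-liner.  Here’s a minimal Python sketch (my own illustration; the numbers are made up, and nothing beyond the formula itself comes from the derivation above):

```python
def bayes(prior, likelihood, evidence):
    """Return the posterior P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# Hypothetical numbers: P(A) = 0.3, P(B|A) = 0.5, P(B) = 0.25.
# The support, P(B|A)/P(B) = 2, doubles our belief in A: 0.3 -> 0.6.
print(bayes(prior=0.3, likelihood=0.5, evidence=0.25))
```

When the support is greater than 1, observing B raises our belief in A; when it’s less than 1, observing B lowers it.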

Applying Bayes’ Theorem

I’ll re-use the medical testing scenario from the previous post.

Substituting T to represent a positive test result and D to indicate the presence of the disease our equation becomes:

P(D|T) = P(T|D) P(D) / P(T)

We expand P(T) into P(T|D)P(D) + P(T|¬D)P(¬D) which is to say: The probability of getting a positive test, P(T), is equal to [the probability of getting a positive test when the disease is present, P(T|D), multiplied by the probability that the disease is present, P(D)] plus [the probability of getting a positive test when the disease is not present, P(T|¬D), multiplied by the probability that the disease is not present, P(¬D)].

So now we just need to map the numbers we have about the disease and the test to the variables in our equation:

P(D), our prior, is 1 in 200 million–the disease occurrence rate in the general population.

P(T|D), the probability of getting a positive test when the disease is present, is derived from the false negative rate.  When the disease is present, our test will incorrectly report a negative only 1 out of 200,000 times; so P(T|D) is 199,999 out of 200,000.

P(T|¬D), the probability of getting a positive test when the disease is not present, is the false-positive rate: 1 in 100,000.

P(¬D), the probability of the disease not being present, is simply the other 199,999,999 out of 200 million.

So let’s plug these numbers in step by step:

P(D|T) = (199,999/200,000 × 1/200,000,000) / [(199,999/200,000 × 1/200,000,000) + (1/100,000 × 199,999,999/200,000,000)]

P(D|T) ≈ (5.0 × 10⁻⁹) / (5.0 × 10⁻⁹ + 1.0 × 10⁻⁵)

P(D|T) ≈ 0.0005

And we see that P(D|T), the probability that the disease is present given a positive test, is only about 0.05%.  Despite the test’s impressive error rates, the disease is so rare that a positive result still leaves it overwhelmingly likely that you’re healthy.
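The arithmetic is easy to check with a short Python script (my own verification, not part of the original post):

```python
# Numbers from the medical-test example.
p_d = 1 / 200_000_000            # prior: disease prevalence in the general population
p_t_given_d = 199_999 / 200_000  # P(T|D): 1 minus the false negative rate
p_t_given_not_d = 1 / 100_000    # P(T|not D): the false positive rate

# Law of total probability: P(T) = P(T|D)P(D) + P(T|not D)P(not D)
p_t = p_t_given_d * p_d + p_t_given_not_d * (1 - p_d)

# Bayes' theorem: P(D|T) = P(T|D)P(D) / P(T)
p_d_given_t = p_t_given_d * p_d / p_t
print(f"P(D|T) = {p_d_given_t:.4%}")  # roughly 0.05%
```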

The car example I presented didn’t use hard numbers, but I’ll frame the concept in terms of Bayes’ theorem.  S will represent a car being stolen and H will represent a car being a Honda Civic:

P(S|H) = P(H|S) P(S) / P(H)

Which says, the probability that a car is stolen given that it’s a Honda Civic is equal to [the probability that a car is a Honda Civic given that it’s stolen] multiplied by [the probability of a car being stolen] divided by [the probability of a car being a Honda Civic].

The dealership was trying to push an insurance policy on me by only reporting P(H|S), but unless we account for the number of Honda Civics on the road in the first place, P(H), we aren’t getting the full story.

(Notice that we don’t have to actually care about P(S) if all we’re doing is comparing car make/models.  P(S) doesn’t change for the different make/models so it doesn’t affect the relative rankings.)
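A toy computation makes the difference concrete.  The fleet sizes and theft counts below are invented purely for illustration (the post doesn’t give real figures):

```python
# Hypothetical data: vehicles on the road and vehicles stolen, per model.
on_road = {"Honda Civic": 1_000_000, "Cadillac Escalade": 50_000}
stolen = {"Honda Civic": 1_000, "Cadillac Escalade": 500}

# Ranking by raw theft counts (what the dealership's pitch relies on)...
most_stolen_by_count = max(stolen, key=stolen.get)

# ...versus ranking by theft rate per vehicle on the road, i.e. P(S|H).
theft_rate = {model: stolen[model] / on_road[model] for model in on_road}
most_stolen_by_rate = max(theft_rate, key=theft_rate.get)

print(most_stolen_by_count)  # Honda Civic
print(most_stolen_by_rate)   # Cadillac Escalade
```

With these made-up numbers the Civic tops the raw-count list simply because there are twenty times as many Civics on the road, while any individual Escalade is ten times more likely to be stolen.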

Bayesian Reasoning

May 10, 2012 4:10 pm

Bayesian reasoning (the application of Bayes’ theorem) is incredibly important, but virtually unknown (and not understood) in the general population.  Politicians, advertisers, and salespersons take advantage of this lack of understanding to extract votes and money; doctors, lawyers, and engineers can make life-threatening mistakes if they don’t apply the process correctly.  It’s a concept so vital to the proper interpretation of the world around us that I believe it should be a mandatory subject during high school.

Let’s start with an example to motivate the subject.  Let’s say you’re not feeling well and you go to your doctor.  Your doctor can’t determine what’s wrong and decides to run some blood tests.  The results come back and your doctor informs you that the test for a terminal disease came back positive.  Do you panic?  Probably.  Should you panic? Not necessarily.  There is vital information which we don’t know and need to know in order to understand the situation properly.

First, we need to know how accurate the test is.  We need to know the rates of type I and type II errors (“false positive” and “false negative”, respectively).  Let’s say you ask this question and your doctor tells you the test has a false positive rate of 1 in 100,000 and a false negative rate of 1 in 200,000.  One might believe that there is a 99.999% chance that you have this rare disease, but we need to interpret these error rates correctly.  This is where Bayes’ theorem comes in and to apply it we also need to know how rare the disease is.

Let’s say it’s an extremely rare disease, affecting 1 in 200 million people.  Knowing this you can update the probability that you have the disease to something more accurate.  Applying Bayes’ theorem we can calculate that a more accurate probability of having the disease is about 0.05%.  That’s a pretty big difference.

There is a caveat though; we’re ignoring any compounding factors that led the doctor to run the test in the first place.  This calculation assumes that we had no particular reason to run the test, so we use the occurrence rate in the general population (1 in 200 million).  If you have specific risk factors that narrow the base group, then the probability of having the disease will increase.

For example: let’s suppose that the doctor chose to run this test specifically because you have high cholesterol levels and a family history of diabetes.  Suppose that the disease occurs in roughly 1 in 1 million people with those risk factors.  Now, instead of using 1 in 200 million as our occurrence rate, we’d use 1 in 1 million.  In that case the probability of having the disease rises to 9.1%.

Suppose another risk factor increases the likelihood from 1 in 1 million to 1 in 100.  Now the probability of having the disease shoots up to 99.9%.  It’s really important to understand what a positive result from a medical test actually means.  If you don’t understand Bayes’ theorem then you can wildly misinterpret reality and make some pretty serious mistakes.
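A short Python sketch (my own, reusing the error rates from this example) shows how dramatically the prior drives the answer:

```python
def p_disease_given_positive(prior):
    """P(D|T) for a test with a 1-in-200,000 false negative rate
    and a 1-in-100,000 false positive rate."""
    sensitivity = 199_999 / 200_000  # P(T|D)
    false_pos = 1 / 100_000          # P(T|not D)
    p_t = sensitivity * prior + false_pos * (1 - prior)
    return sensitivity * prior / p_t

# Same test, three different base rates: the posterior swings from
# a fraction of a percent to near certainty.
for prior in (1 / 200_000_000, 1 / 1_000_000, 1 / 100):
    posterior = p_disease_given_positive(prior)
    print(f"prevalence 1 in {round(1 / prior):,}: P(D|T) = {posterior:.2%}")
```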

Let’s do another example.  This one is a little easier to understand than the medical testing example.

When I bought my car in 2007, the dealership tried to tack on a charge for something which amounted to a small insurance policy which would pay out if the car were stolen.  The argument used to push this charge was that Honda Civics are the number one stolen car.  Of course, our questions are:  Is that true?  And does it mean what we think it means?

According to the Insurance Bureau of Canada’s 2006 list of the top ten most-stolen cars, the 2000, 1999, 1996 and 1994 Honda Civics took 4 of the 10 places.  Hmmm…sounds bad, huh?  But I’m sure you can guess by now that there’s a catch.

It’s vitally important to our analysis to know how many 2000, 1999, 1996, and 1994 Honda Civics exist in the first place and how many actually got stolen.  But we don’t have any of these numbers.  What we have is someone taking a list of stolen cars and adding up each type and declaring the Honda Civic as the most stolen car.  We need to know how many are stolen compared to how many of them exist.

Luckily for us, the Highway Loss Data Institute understands the difference and correctly reports the likelihood of a vehicle getting stolen by comparing the “‘theft claim frequency,’ which is the number of thefts reported for every 1,000 of each vehicle on the road.”  When we look at these numbers the picture changes entirely.  This top-ten list is filled with expensive cars like the Cadillac Escalade and several high-end pickup trucks.  The Honda Civic is nowhere to be found. 

If you don’t understand Bayes’ Theorem you can be manipulated into making bad decisions.  This applies all over the place in our lives.  It applies in our airport security procedures, our medical exams, our insurance decisions, political decisions, and our general level of fear about life.

You can read more about Bayes’ Theorem in its Wikipedia article.  I’m not going to try to teach it here (unless I hear a demand for it in the comments) because it’s not an entirely intuitive concept and it’s a little tricky to wrap your head around (which is why we get it wrong so often).

When good security is a problem itself

April 11, 2012 2:10 pm

NPR’s article, “Spate Of Bomb Threats Annoys Pittsburgh Students”, got me thinking about the unintended consequences of implementing good security, even ignoring the other issues involved, like civil rights violations and the creation of easily attacked lines of people.

Reacting to every threat has at least two detrimental effects: denial of service and complacency.

The first, and most immediate, is the ability for an adversary to shut down a system without doing anything but writing a letter, making a phone call, or posting something on the Internet.

In computer security we call this type of attack a denial-of-service (DoS) attack.  With a computer it is usually achieved by making legitimate requests at such a frequency as to bog down the machine and prevent it from responding to normal users.

In this case, however, it’s making threats and forcing law enforcement to respond.  This has two effects.  The first is that it takes law enforcement away from legitimate calls (denying those people the service of law enforcement).  The second is when law enforcement responds by shutting down or drastically reducing the functionality of the threatened target (denying service to the customers of that target).

In the article the students are queued up waiting to go through a security checkpoint in order to get on campus.  In airports they might clear the gates and require everyone to go through security again.  In either case massive amounts of time and money are wasted.  The attacker has done nothing, but still managed to mess with their target.

In this manner terrorists could cause billions of dollars in losses to our economy simply by calling in threats to airports, shopping malls, schools, stadiums, etc.  And given our level of unwarranted fear, what law enforcement agency is going to do nothing when they receive a threat like that?  If they’re wrong no one will listen to arguments about likelihood, corroborating evidence, etc.

The second detrimental effect is complacency or “the boy who cried wolf” effect.  One technique used to bypass an alarm system is to repeatedly trip the alarm, but do nothing else.  Eventually the people responding to the alarm may begin to delay responding presuming it’s another false alarm.  Or in the best case (from the attacker’s view) they may turn off the alarm altogether.

If they do continue responding to the alarm then they’re faced with a dilemma: How many times do you respond to an alarm at a cost of $X per response before you can no longer afford to respond?  How many airports do you shut down and flights do you cancel before the airlines begin going bankrupt or flying becomes so unreliable that people just stop trying?

In the case of the school in the article, the University of Pittsburgh, how many more of these threats are they going to evacuate buildings and run security checkpoints for before students start leaving to find schools that actually have time for education?

These are two of the problems that exist from treating every threat seriously and not using risk management techniques to handle threat response.  But given that everyone involved would be fired, if not prosecuted, if they were wrong, what alternative do they have?

If we shut down our society because we’re afraid then haven’t the terrorists won without ever doing a thing?