Thursday, January 30, 2014

Net Run Rate in Cricket and Simpson's Paradox

Here's what happens when a football fan with an understanding of numbers and a passing interest in paradoxes turns his mind to cricket. (I'm sure something similar was possible when goal average was used as a tie-breaker in football, before goal difference became the preferred method.)

Imagine a typical eight-team cricket tournament, where the teams are divided into two groups of four, and the top two teams in each group progress to the semi-finals. Final placings in each group are decided by points gained (two points for a win, one point for a tie or no result) with the first tie-breaker being net run rate.

After two rounds of matches in one of the groups, these are the results.

North 250/4 off 50 overs defeated South 230/8 off 50 overs by twenty runs.
East 101/2 off 14 overs defeated West 100/10 off 22.3 overs by eight wickets.
East 205/7 off 44.3 overs defeated North 200/10 off 47.1 overs by three wickets.
South 160/10 off 31.2 overs defeated West 150/10 off 46.4 overs by ten runs.

Remember that a team that is bowled out is treated as having batted its full fifty overs, and that overs are converted to decimals, so 44.3 overs (44 overs and 3 balls) becomes 44.5. At this stage, the points table is as follows:

Team   M  W  L  P   RS     OB    RRF   RC     OF    RRA     NRR
East   2  2  0  4  306   58.5  5.231  300    100  3.000   2.231
South  2  1  1  2  390    100  3.900  400    100  4.000  -0.100
North  2  1  1  2  450    100  4.500  435   94.5  4.603  -0.103
West   2  0  2  0  250    100  2.500  261     64  4.078  -1.578

(M = Matches, W = Wins, L = Losses, P = Points, RS = Runs Scored, OB = Overs Batted, RRF = Run Rate For, RC = Runs Conceded, OF = Overs Fielded, RRA = Run Rate Against, NRR = Net Run Rate)
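For anyone who wants to check the arithmetic, here is a rough Python sketch of how the NRR column can be reproduced from the scorecards. The function and variable names are my own inventions for illustration, and the fifty-over quota and the bowled-out rule are simply the assumptions stated above, not an official implementation.

```python
from fractions import Fraction

QUOTA = 50  # overs per innings assumed for this tournament


def overs_to_decimal(overs: str) -> Fraction:
    """Convert scorecard notation such as '44.3' (44 overs, 3 balls) to overs."""
    whole, _, balls = overs.partition(".")
    return Fraction(int(whole)) + Fraction(int(balls or 0), 6)


def overs_charged(overs: str, wickets: int) -> Fraction:
    """A side that is bowled out is charged its full quota of overs."""
    return Fraction(QUOTA) if wickets == 10 else overs_to_decimal(overs)


# Each innings is (team, runs, wickets, overs as written on the scorecard).
FIRST_TWO_ROUNDS = [
    (("North", 250, 4, "50"), ("South", 230, 8, "50")),
    (("East", 101, 2, "14"), ("West", 100, 10, "22.3")),
    (("East", 205, 7, "44.3"), ("North", 200, 10, "47.1")),
    (("South", 160, 10, "31.2"), ("West", 150, 10, "46.4")),
]


def net_run_rates(matches):
    """Return {team: NRR} from pooled runs and overs across all matches."""
    totals = {}  # team -> [runs scored, overs batted, runs conceded, overs fielded]
    for first, second in matches:
        for own, opp in ((first, second), (second, first)):
            team, runs, wickets, overs = own
            _, opp_runs, opp_wickets, opp_overs = opp
            row = totals.setdefault(team, [0, Fraction(0), 0, Fraction(0)])
            row[0] += runs
            row[1] += overs_charged(overs, wickets)
            row[2] += opp_runs
            row[3] += overs_charged(opp_overs, opp_wickets)
    return {t: float(rs / ob - rc / of) for t, (rs, ob, rc, of) in totals.items()}


# Print the teams from best to worst net run rate.
for team, nrr in sorted(net_run_rates(FIRST_TWO_ROUNDS).items(), key=lambda kv: -kv[1]):
    print(f"{team:<6}{nrr:+.3f}")
```

Rounded to three decimal places, the values agree with the NRR column in the table. Fractions are used for the overs so that 44.3 overs becomes exactly 44.5 and not a rounded float.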

After two rounds, South and North are tied on points, but South are ranked narrowly ahead of North as the result of their better net run rate.

The third round of matches is then played, with the following results.

West 295/6 off 48 overs defeated North 294/4 off 50 overs by four wickets.
East 129/9 off 47 overs defeated South 125/10 off 21.3 overs by one wicket.

East now have three wins and therefore win the group.

West have joined North and South on one win, so net run rate will determine which of the three teams will finish second and progress to the semi-finals.

In this last round of games, West’s net run rate was 0.266, bringing their overall net run rate over the three matches to -1.186. This will see West finish last in the group.

In the third round, South’s net run rate was -0.245, slightly better than North’s -0.266. South were already ahead of North on net run rate after two rounds, so one would expect that South would finish above North in the final standings.

But no! Amazingly, the overall net run rate for South is -0.165 while North’s is slightly better at -0.163. So even though South had a better net run rate than North after two matches, and achieved a better comparative net run rate in the third round, North finish with a better net run rate overall and progress to the semi-finals.

Team   M  W  L  P   RS     OB    RRF   RC     OF    RRA     NRR
East   3  3  0  6  435  105.5  4.123  425    150  2.833   1.290
North  3  1  2  2  744    150  4.960  730  142.5  5.123  -0.163
South  3  1  2  2  515    150  3.433  529    147  3.599  -0.165
West   3  1  2  2  545    148  3.682  555    114  4.868  -1.186

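This is an example of Simpson’s Paradox, where a ranking that holds in each part of the data reverses when the parts are pooled. It happens here because the overall net run rate is calculated from total runs and total overs, not by averaging the rates from the separate rounds, and it shows how adding and averaging numbers can be fraught with peril.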
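The reversal is easy to check for yourself. The short Python sketch below takes South's and North's figures from the tables, works out the net run rate for the first two rounds, for the third round, and for the pooled totals, and reports who is ahead each time. The names are hypothetical and the split into "first two rounds" and "third round" is just the one used in this post.

```python
# (runs scored, overs batted, runs conceded, overs fielded), taken from the
# tables in this post; a bowled-out side has already been charged 50 overs.
FIRST_TWO_ROUNDS = {
    "South": (390, 100, 400, 100),
    "North": (450, 100, 435, 94.5),
}
THIRD_ROUND = {
    "South": (125, 50, 129, 47),
    "North": (294, 50, 295, 48),
}


def nrr(runs_scored, overs_batted, runs_conceded, overs_fielded):
    """Net run rate: run rate for minus run rate against."""
    return runs_scored / overs_batted - runs_conceded / overs_fielded


def pooled(a, b):
    """Add two periods together column by column before taking any ratio."""
    return tuple(x + y for x, y in zip(a, b))


OVERALL = {t: pooled(FIRST_TWO_ROUNDS[t], THIRD_ROUND[t]) for t in FIRST_TWO_ROUNDS}

for label, period in (("rounds 1-2", FIRST_TWO_ROUNDS),
                      ("round 3", THIRD_ROUND),
                      ("overall", OVERALL)):
    south, north = nrr(*period["South"]), nrr(*period["North"])
    leader = "South" if south > north else "North"
    print(f"{label:<11} South {south:+.3f}  North {north:+.3f}  -> {leader} ahead")
```

South come out ahead in both parts, yet North come out ahead once the runs and overs are pooled, because the ratios are taken only after the totals have been added up.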

What if instead of combining the scores from the three matches and then doing the net run rate calculation, we calculated each team’s net run rate compared to their opponents for each individual match and then added these together to create the NRR?

North’s net run rates in their three matches were 0.400, -0.607 and -0.266, for a combined total of -0.473.
South’s net run rates in their three matches were -0.400, 0.200 and -0.245, for a combined total of -0.445.

Team   M  W  L  P   RS     OB    RRF   RC     OF    RRA     NRR
East   3  3  0  6  435  105.5  4.123  425    150  2.833   6.066
South  3  1  2  2  515    150  3.433  529    147  3.599  -0.445
North  3  1  2  2  744    150  4.960  730  142.5  5.123  -0.473
West   3  1  2  2  545    148  3.682  555    114  4.868  -5.148

So using this method, South would finish above North.
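Here is a minimal Python sketch of this alternative, assuming the overs have already been adjusted as described earlier (fifty for a side that was bowled out, and 44.3 written as 44.5). The function name and the layout of the match tuples are hypothetical and only there for illustration.

```python
from collections import defaultdict

# (team A, runs, overs charged, team B, runs, overs charged)
# Overs are already adjusted: 50 for a side bowled out, 44.3 written as 44.5.
MATCHES = [
    ("North", 250, 50, "South", 230, 50),
    ("East", 101, 14, "West", 100, 50),
    ("East", 205, 44.5, "North", 200, 50),
    ("South", 160, 50, "West", 150, 50),
    ("West", 295, 48, "North", 294, 50),
    ("East", 129, 47, "South", 125, 50),
]


def summed_match_nrr(matches):
    """Sum each team's run-rate margin over its opponent, match by match."""
    totals = defaultdict(float)
    for a, a_runs, a_overs, b, b_runs, b_overs in matches:
        margin = a_runs / a_overs - b_runs / b_overs
        totals[a] += margin  # team A gains the margin
        totals[b] -= margin  # team B loses the same amount
    return dict(totals)


# Print the teams from best to worst summed margin.
for team, total in sorted(summed_match_nrr(MATCHES).items(), key=lambda kv: -kv[1]):
    print(f"{team:<6}{total:+.3f}")
```

One thing the sketch makes obvious is that every match adds a margin to one side and takes the same margin from the other, so the totals across the group always sum to zero (up to floating-point rounding). It also shows how a single lopsided match, such as East against West, dominates a team's total.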


My personal feeling is that this is a fairer method of calculating net run rate and more accurately reflects the reality of what happened during the three matches. It certainly does a good job of depicting the thrashing East handed out to West in the first match.

It may also have some negatives I haven't thought of. Perhaps some statistical experts could provide their input.


