Hello. I recently received a text about over/under betting. However, I can't really decipher it, so I'm asking you for help: what exactly is this text about? I'm pasting the text in English, along with a machine translation into Polish.

Predicting the Final Score
Gilbert W. Bassett Jr.
June 1996
Abstract: In the usual model for rating teams, the outcome of a pairwise contest is represented by
the difference in the relative strength of the teams. In this paper the standard model is extended to
account for the total points scored by each team. The new model can be used to predict not only
that the Cowboys are three points better than the Bears, but that the final score will be 27-24.
Besides being more informative about the outcome of the game, this provides an estimate of the
total points scored by both teams, the so-called over/under. The method also yields a
decomposition of each team's relative strength into offensive and defensive components. The
method is illustrated for NFL teams in 1993.
Keywords: Ratings, Least Squares, Least Absolute Errors, Point Spreads
University of Illinois at Chicago
Department of Economics (m/c 144)
601 S. Morgan St. RM 2103
Chicago, Illinois 60607-7121
E-Mail: gib@uic.edu

1. INTRODUCTION
In the conventional statistical model for rating teams the final score of a game is represented as
the difference between the opponents' rating parameters. Descriptions and applications of models
for team ratings can be found in Leake(1976), Stefani(1980), Harville(1980), Stern (1992, 1995),
Keener(1993), Wilson(1995), and Bassett(1996). With the standard model, a 3 point difference
between the Cowboys and Bears means the Cowboys will likely win by three, but it does not say
whether a 10-7 outcome is more likely than 24-21.
The purpose of this note is to describe a variation of the usual model that permits
predictions of the final scores for each team. The model can be used to predict the total points
scored in a contest, which is known as the "over-under". As a bonus it also yields a
decomposition of a team’s overall strength into offensive and defensive components.
The next section describes the model. Section 3 presents an application using data on
NFL teams during the regular 1993 season. Estimates are provided using both least squares (L2)
and least absolute values (L1). Section 4 discusses features of the ratings. One feature concerns
the ratings' relation to “normalized” scores, that is, a team’s score after controlling for the home
field advantage and the quality of the opponent. The least squares estimate for the offensive
parameter is the average of a team’s points scored, controlling for the quality of the opponent
defenses, while the defensive parameter is the average of points allowed after allowing for the
quality of opponent offenses. The L1 rating is determined analogously except that the “average”
is replaced by the median. Also considered is the combination of the offensive and defensive
ratings into a single measure of overall strength. This derived measure is compared with the
estimates obtained from the usual model based on point differences.
2. THE MODEL
Teams are indexed, t=1,...,T, and games g=1,...,G. Each game has two teams, home and away,
identified by hg and ag. Let S(hg) denote the score of the home team in game g, and let S(ag)
denote the away team's score in the gth game. The difference in the final score is, Dg=S(hg)-S(ag).
The home field advantage represents the additional points scored by the home team
compared with what it would have scored if the game had been at a neutral site. The home field
advantage is denoted by h0.
In the usual rating model one rating parameter, Rt, is associated with each team; it
corresponds to a team's strength compared with other teams. Since the ratings are derived from
score differences, it will be called the point spread model; for discussion of point spread betting
markets see Bassett(1981). The difference in the final score of game g is given by,
POINT SPREAD MODEL
Dg = h0 + Rh(g) - Ra(g) + eg, g=1,...,G (2.1)
where eg is a random error term.
Estimates of the rating parameters are based on score differences, {Dg, g=1,...,G}.
This standard setup contrasts with the "final score" model in which two parameters are
assigned to each team, one for offense and one for defense. In this case the dependent variable
corresponds to the final score of each contestant, S(hg) and S(ag).
The offensive parameter for team t is denoted by OFFt and the defensive parameter is
denoted by DEFt. The offensive parameter measures a team's ability to score points. In the
football context, where teams have offensive and defensive units, the offensive parameter will be
correlated with the offensive unit's ability, but the parameter most accurately reflects the team's
ability to score points, even if it is because points are scored by the defensive unit or because a
superior defense leaves the offense in favorable scoring positions. Similarly, the defensive
parameter represents the ability to limit points scored by an opponent. The defensive parameter is
correlated with the strength of the defensive unit, but might also reflect a superior offensive unit
that either leaves the opponent far from the goal line, or is on the field for a long time thus
leaving the defense rested.
The model for the total points scored by each team is given by,
FINAL SCORE MODEL
S(hg) = h0 + OFFh(g) - DEFa(g) + eh(g) g=1,...,G
(2.2)
S(ag) = OFFa(g) - DEFh(g) + ea(g) g=1,...,G
This says that the points scored by the home team are equal to the home field advantage, plus its
offensive rating, minus the opponent's defensive rating, plus a random term. The score of the
away team is similar, but excludes the home field factor. The random term can be thought of as
accounting for the "breaks", "bounces of the ball", and other game specific factors that affect
final scores [1].
This is a standard linear model. It has 2G observations and 1+2T parameters: one for the
home field advantage, plus T offensive and defensive parameters.
Remarks: 1. The model (2.2) only determines ratings up to a constant term. (If
(OFFt,DEFt), t=1,...,T satisfy (2.2) then so do (OFFt+a,DEFt+a), t=1,...,T, where a is arbitrary.)
To anchor ratings and make them unique it is convenient to add a pseudo (2G+1)st observation
that specifies the value of one parameter, say, 0=DEF1+e(2G+1). The effect of this additional
observation is to force DEF1=0, with all other ratings uniquely determined relative to DEF1. (The
estimated error at the pseudo observation--based on any method of minimizing errors--will be
zero; if it were a≠0 then subtracting "a" from each offensive and defensive parameter estimate
would reduce the error at the pseudo observation to zero without changing the fit at any other
observation, thus contradicting e(2G+1)=a≠0.)

[1] It will be assumed for simplicity that the error terms for the two teams playing a game are
independent. This simplifying assumption means that there is not a common factor to realizations
of the error terms. It is easy to think of situations where this is doubtful. Poor weather conditions,
for example, usually produce lower than expected scores for both teams thus resulting in
correlated errors.

2. To write (2.2) in linear model form, partition the vector of dependent variables into
[S(h1),...,S(hG)|S(a1),...,S(aG)]. The first column of the design, representing the home field
advantage, is then given by [1,...,1|0,...,0]. Partition the remaining parameters so that all
offensive parameters come first; the (column) parameter vector is
[OFF1,...,OFFT|DEF1,...,DEFT]'. Each row of the design (besides the first column) will then have
a "+1" in the column corresponding to the team's offensive parameter and a "-1" in the column
corresponding to its opponent's defensive ability. The partitioned design is
    [ 1  X11  X12 ]
    [ 0  X21  X22 ]
where 1 is a vector of ones, 0 is a vector of zeros, and
(i) X11 is GxT with rows {xgt} with xgt=+1 if t=h(g), and 0 otherwise,
(ii) X12 is GxT with rows {xgt} with xgt=-1 if t=a(g), and 0 otherwise,
(iii) X21 is GxT with rows {xgt} with xgt=+1 if t=a(g), and 0 otherwise,
(iv) X22 is GxT with rows {xgt} with xgt=-1 if t=h(g), and 0 otherwise.
Notice that X11=-X22 and X12=-X21.
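
The partitioned design described in Remark 2 can be sketched in code. This is only an illustration, not the author's implementation: it assumes NumPy, teams indexed from 0, and a hypothetical helper name `final_score_design`; the three-game schedule at the bottom is made up.

```python
import numpy as np


def final_score_design(home, away, n_teams):
    """Build the (2G) x (1 + 2T) design for the final score model (2.2).

    Column 0 is the home field column [1,...,1|0,...,0]; columns 1..T hold
    the offensive parameters and columns T+1..2T the defensive parameters.
    `home` and `away` are integer arrays of 0-based team indices.
    """
    home = np.asarray(home)
    away = np.asarray(away)
    n_games = len(home)
    g = np.arange(n_games)
    X = np.zeros((2 * n_games, 1 + 2 * n_teams))
    X[g, 0] = 1.0                              # home field advantage
    X[g, 1 + home] = 1.0                       # X11: home team's offense
    X[g, 1 + n_teams + away] = -1.0            # X12: away team's defense
    X[n_games + g, 1 + away] = 1.0             # X21: away team's offense
    X[n_games + g, 1 + n_teams + home] = -1.0  # X22: home team's defense
    return X


# Hypothetical three-team schedule, for illustration only.
X = final_score_design(home=[0, 1, 2], away=[1, 2, 0], n_teams=3)
print(X.shape)
```

Slicing the result into its four GxT blocks gives a quick check of the relations X11=-X22 and X12=-X21 noted above.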
3. RATINGS ESTIMATES
Table 1 shows least squares (L2) estimates for the offensive and defensive ratings for NFL teams
based on games played during the 1993 regular season. There were G=224 games and there are
T=28 teams. The estimates are therefore based on 448 observations and there are 56+1=57
parameters, including the home field advantage. The estimates are scaled so that SF's offensive
and defensive parameter sum to 100. The table shows (i) won/loss records, total points
scored(PF), and total points allowed (PA) for each team, (ii) ratings and ranks for each team's
offensive and defensive parameter, and (iii) the sum of the offensive and defensive ratings, which
is a measure of a team's overall relative strength.
The table shows that the offensive and defensive ratings for a team are sometimes very
different, a detail obscured when only a single rating estimate for a team is constructed. For
example, the Bears (CHI) had the 3rd best defense but only the 25th best offense. On the other
hand, the eventual Super Bowl winning Cowboys (DAL) had the second best offense and the second best
defense.
To illustrate how the estimates can be used to predict final scores, consider the first
playoff game between the Raiders (LAA) and the Broncos (DEN). The predicted final score
would have been 24-21 in favor of Denver. The predicted score (20.8, rounded to 21) for the home team
Raiders is the sum of the home field advantage (2.8), plus the Raiders' offensive rating (54.9),
minus Denver's defensive rating (36.9). The score for Denver comes from its offensive rating of
58.5 minus the 34.8 defensive rating of the Raiders. This prediction did not turn out too well as
the Raiders beat the Broncos 42-24.
On the other hand, the prediction for the playoff game between the Giants (NYG) and the
Vikings (MIN) turned out better. The prediction was 18-10 favoring NYG
(SNYG=18.2=2.8+53.1-37.7 and SMIN=10.5=52.5-42.0). NYG won 17-10.
The Super Bowl (played at a neutral site) had a predicted final score of 19-14, Dallas over
Buffalo (SDAL=19.1=58.7-39.6, SBUF=14.3=55.6-41.3). Dallas won 30-13.
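
The arithmetic behind these three predictions can be reproduced with a short sketch. The ratings are copied from Table 1 and the home field advantage of 2.8 from the text; the function name `predict` and the `neutral` flag are illustrative assumptions, not part of the paper.

```python
# L2 ratings from Table 1 for the six teams discussed above: (OFF, DEF).
RATINGS = {
    "LAA": (54.9, 34.8),
    "DEN": (58.5, 36.9),
    "NYG": (53.1, 42.0),
    "MIN": (52.5, 37.7),
    "DAL": (58.7, 41.3),
    "BUF": (55.6, 39.6),
}
HOME_FIELD = 2.8  # estimated home field advantage h0 quoted in the text


def predict(home, away, neutral=False):
    """Return (home score, away score, total) from model (2.2) with errors set to zero."""
    off_h, def_h = RATINGS[home]
    off_a, def_a = RATINGS[away]
    s_home = (0.0 if neutral else HOME_FIELD) + off_h - def_a
    s_away = off_a - def_h
    return round(s_home, 1), round(s_away, 1), round(s_home + s_away, 1)


for home, away, neutral in [("LAA", "DEN", False),
                            ("NYG", "MIN", False),
                            ("DAL", "BUF", True)]:
    print(home, away, predict(home, away, neutral))
```

The third value returned for each game is the sum of the two predicted scores, which is the model's estimate of the over/under.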
For comparison purposes, Table 2 presents the offensive and defensive ratings estimated
by least absolute errors (L1); see Bassett and Koenker(1978) and Bassett(1996) [2]. (The ratings are
scaled so that the L1 and L2 estimates for OFFDAL are equal).
The table shows L2 and L1 do not always agree. A few examples: (i) the Bills’ (BUF)
defense ranked 5th according to L2, but only 17th according to L1; (ii) the 49ers (SF) defense
ranked 16th by L2, but 5th by L1; and (iii) the Cowboys (DAL) offense was second by L2 but
seventh with L1. The different estimates also lead to different predicted final scores. For the
Super Bowl, L1's predicted score was 20-13. As explained in Bassett(1996) the differences can be
traced to the fact that L2 is based on the average and L1 is based on the median statistic.
4. DISCUSSION
Normalized Scores
It was previously shown that for the point spread model (2.1), there is a simple relation
between the rating estimates and normalized scores; Bassett(1996). A normalized score is an
estimate of Dg, controlling for home field advantage and the quality of opponent. A team’s least
squares rating is the average of its normalized scores, while the L1 rating is the median of
normalized scores.

[2] For the rating design matrix, the L1 estimates will not generally be unique. To obtain unique
estimates it is necessary to slightly perturb the design matrix by reweighting observations. The
unique estimates in Table 2 were obtained by weighting each game by (1+d*w) where w is the
week of the season and d=.00001. The effect of this weighting is to make the estimates unique and
give recent games slightly more influence in determining the estimates; see Bassett(1996).

For the final score model (2.2) there is an analogous relation between the estimates and
normalized scores. Now, however, it is a normalized offensive score that controls for the
defensive ability of the opponent, while the normalized defensive score controls for offensive
ability of the opponent. It can be shown from the first order conditions for the estimate that a
team's L2 offensive rating is equal to its average points scored per game--after normalizing for the
home field advantage and the opponent's defensive strength. Similarly, the defensive rating
corresponds to average points allowed, normalized by the home field advantage and the
opponent's offensive ability. The same thing holds for the L1 estimate except that the average
statistic is replaced by the median. The proof is a straightforward extension of the corresponding
property for the model (2.1); see the appendix of Bassett(1996).
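
The L2 part of this property can be checked numerically. The sketch below is an assumption-laden illustration rather than the paper's computation: it uses NumPy, a made-up five-team league in which every ordered pair of teams meets once, and simulated scores. A normalized offensive score is taken to be the points scored, minus h0 when the team is at home, plus the opponent's estimated DEF rating.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 5  # hypothetical league size
# Every ordered (home, away) pair plays once; simulated scores, not NFL data.
home, away = np.array([(i, j) for i in range(T) for j in range(T) if i != j]).T
G = len(home)
s_home = rng.poisson(23, size=G).astype(float)
s_away = rng.poisson(20, size=G).astype(float)
g = np.arange(G)

# Design for model (2.2): columns are [h0 | OFF_1..OFF_T | DEF_1..DEF_T].
X = np.zeros((2 * G, 1 + 2 * T))
X[g, 0] = 1.0
X[g, 1 + home] = 1.0
X[g, 1 + T + away] = -1.0
X[G + g, 1 + away] = 1.0
X[G + g, 1 + T + home] = -1.0
y = np.concatenate([s_home, s_away])

# lstsq returns one least squares solution (ratings are unique only up to a constant).
beta = np.linalg.lstsq(X, y, rcond=None)[0]
h0, off, dfn = beta[0], beta[1:1 + T], beta[1 + T:]


def normalized_offense(t):
    """Team t's points, minus h0 in home games, plus the opponent's DEF rating."""
    at_home = s_home[home == t] - h0 + dfn[away[home == t]]
    on_road = s_away[away == t] + dfn[home[away == t]]
    return np.concatenate([at_home, on_road])


avg_normalized = np.array([normalized_offense(t).mean() for t in range(T)])
print(np.round(avg_normalized, 3))
print(np.round(off, 3))
```

The two printed vectors agree, which is the least squares first order condition for each offensive parameter written out in terms of averages.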
20 minus 13 Equals 10?
Suppose your best guess for the final score is 20-13. Does it follow that your best guess
about the difference in the final score will be 7 points? Or could a reasonable point spread
estimate be 10 points when the final score estimate is 20-13? Does the best guess about the
game’s final score have to translate into a best guess about the point spread?
To see how this relates to final scores consider the difference S(hg)-S(ag) where scores are
determined by (2.2). Rearranging terms gives,
S(hg)-S(ag)= Dg = h0 + [OFFh(g)+DEFh(g)] - [OFFa(g)+DEFa(g)]+ [eh(g)- ea(g)].
This says the difference in the final score is the sum of the home field advantage and the
difference in (i) a composite term for the home team, OFFh(g)+DEFh(g), and (ii) a composite term
for the away team, OFFa(g)+DEFa(g). This is exactly how the point spread model (2.1) works,
except that relative strength is here expressed in terms of separate parameters for (OFFt,DEFt)
(instead of a single parameter Rt) and the data is disaggregated to {S(hg), S(ag)} (instead of score
differences, {Dg}).
Let the estimate of overall strength based on the offense and defense parameter estimates
be denoted by R*t=OFFt+DEFt. Contrast this with the estimate of relative strength, call it R't,
obtained from the standard model (2.1) for the score differences, {Dg}.
The estimates for relative strength based on (2.1) are presented in Table 3; these were
previously considered in Bassett(1996). The table first shows the L2 estimates for R't alongside
R*t. As can be seen, the estimates are identical. It can be shown that this will necessarily be the
case: the L2 estimate for relative strength based on model (2.1) and data Dg will be identical to the
estimates derived by summing the OFFt and DEFt estimates based on the model (2.2). This
identity follows from the linearity of least squares. It means that when least squares says the final
score will be 20-13, it will also predict a point spread of 7 points.
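
The least squares identity can be illustrated on simulated data. This is a sketch under stated assumptions, not the paper's computation: it uses NumPy, a hypothetical six-team league in which every ordered pair of teams meets once, and randomly generated scores. Because ratings are determined only up to a constant, the comparison is made relative to team 0.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 6  # hypothetical league size
# Every ordered (home, away) pair plays once; simulated scores, not NFL data.
home, away = np.array([(i, j) for i in range(T) for j in range(T) if i != j]).T
G = len(home)
s_home = rng.poisson(23, size=G).astype(float)
s_away = rng.poisson(20, size=G).astype(float)
g = np.arange(G)

# Final score model (2.2): columns are [h0 | OFF_1..OFF_T | DEF_1..DEF_T].
X = np.zeros((2 * G, 1 + 2 * T))
X[g, 0] = 1.0
X[g, 1 + home] = 1.0
X[g, 1 + T + away] = -1.0
X[G + g, 1 + away] = 1.0
X[G + g, 1 + T + home] = -1.0
beta = np.linalg.lstsq(X, np.concatenate([s_home, s_away]), rcond=None)[0]
r_star = beta[1:1 + T] + beta[1 + T:]  # R*_t = OFF_t + DEF_t

# Point spread model (2.1): columns are [h0 | R_1..R_T], fitted to D_g.
Z = np.zeros((G, 1 + T))
Z[g, 0] = 1.0
Z[g, 1 + home] = 1.0
Z[g, 1 + away] = -1.0
gamma = np.linalg.lstsq(Z, s_home - s_away, rcond=None)[0]
r_prime = gamma[1:]  # R'_t

print(np.round(r_star - r_star[0], 3))
print(np.round(r_prime - r_prime[0], 3))
print(round(beta[0], 3), round(gamma[0], 3))  # home field advantage from each model
```

Both rating vectors and both home field estimates coincide, so the least squares point spread always equals the difference of the least squares predicted scores.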
Table 4 shows the L1 estimates based on the point spread model. Unlike least squares we
see that neither the ratings nor the associated rankings match those based on OFFt+DEFt. For
example, SF is top-ranked based on the sum of OFFt and DEFt, but only ranked fifth when the
estimation is based on (2.1). A consequence is that a predicted score does not translate into a
prediction for the difference in the final score. In fact the L1 final score estimate for the
Super Bowl was 20-13, even though the L1 point spread had Dallas favored by 10.
This feature of the L1 estimates might seem strange. To see the same thing in an
analogous situation consider estimating the location parameters of random variables W and Z.
Now consider estimating the difference in the location parameters of W and Z. Without
additional information or imposing restrictions on the estimates, there is no reason for the
difference in the estimates used for the first problem to equal the estimated difference in the latter
problem.
The equivalence between the least squares final score and point spread estimates can be
traced to its being a linear estimator based on "expectations" or "averages". In particular, the
identity reflects the property that the average of a difference is the difference of the averages. A
least squares estimate of 20-13 says, in essence, that the Cowboys will, on average, score 20
points against the Bills, and the Bills will score 13 points on average against the Cowboys. It
follows from the linearity of the expected value that the Cowboys will on average score 7 more
points than the Bills.
A 20-13 predicted final score based on L1 however derives from the median, and the
median is not a linear estimator. The L1 predicted score means, in essence, that it is 50-50 for the
Cowboys to score 20 points (half the time more than 20, half the time less than 20), and the 13
estimate means it is 50-50 that the Bills will score 13 against the Cowboys. Since the median of a
difference is not equal to the difference of the medians, it need not be 50-50 for the Cowboys to
win by seven. In fact, based on score differences L1 has the Cowboys favored by 10.
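
A toy example makes the contrast concrete. The scores below are invented for illustration (three imagined meetings, not NFL data) and were chosen so that the medians reproduce the 20-13 versus 10 pattern discussed above; only the Python standard library is used.

```python
from statistics import mean, median

# Hypothetical scores over three imagined meetings (illustrative only).
cowboys = [14, 20, 27]
bills = [13, 10, 17]
margins = [c - b for c, b in zip(cowboys, bills)]  # game-by-game differences

# Averages: the difference of the means equals the mean of the differences.
diff_of_means = mean(cowboys) - mean(bills)
mean_of_diffs = mean(margins)

# Medians: the difference of the medians need not equal the median of the differences.
diff_of_medians = median(cowboys) - median(bills)
median_of_diffs = median(margins)

print(diff_of_means, mean_of_diffs)
print(diff_of_medians, median_of_diffs)
```

Here the median scores are 20 and 13, yet the median margin is 10, while the mean margin matches the difference of the mean scores (up to floating point rounding).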
The point spread estimate can be constrained to equal the difference in the predicted final scores by
including a constraint in the estimation problem associated with the model (2.2). Or, an
estimation method like L2 can be used in which the constraint is automatically satisfied.
Alternatively, the final score and point spread ratings can be estimated separately using a
nonlinear method in which case the best guess about the point spread need not be the same as the
difference in the final score.
5. SUMMARY
In the usual model for rating teams the outcome of a pairwise contest is represented as the
difference in the teams' relative strengths plus a random error. This gives predictions of the
difference in the final scores and leads to team ratings. This paper has shown how to estimate the
separate final scores of the two teams. Properties of estimation methods were discussed and
ratings were illustrated for the 1993 pro football season. Besides being more informative about
the outcome of the game, the final scores provide an estimate of the total points scored by both
teams as well as a decomposition of relative strength into offensive and defensive components.
References
Bassett, Gilbert W.(1981). Point Spreads vs. Odds, Journal of Political Economy, v.80, n. 4,
752-768.
Bassett, Gilbert W.(1996). “Robust Sports Ratings Based on Least Absolute Errors”. Manuscript.
Bassett, Gilbert W. and Roger Koenker (1978). The Asymptotic Theory of Least Absolute Error
Regression, Journal of the American Statistical Association, Vol.73, No. 363, September,
618-622.
Harville, David(1977). "The Use of Linear-Model Methodology to Rate High School or College
Football Teams", Journal of the American Statistical Association, Vol.72, 278-89.
Harville, David(1980). "Predictions for National Football League Games With Linear-Model
Methodology", Journal of the American Statistical Association, Vol.75, 516-524.
Harville, David A. and Michael H. Smith (1994). "The Home-Court Advantage: How Large and
Does It Vary From Team to Team". The American Statistician, v.48, n.1, p.22-28.
Keener, James P.(1993). "The Perron-Frobenius Theorem and the Ranking of Football Teams".
SIAM Review, v.35,no.1, pp.80-93.
Koenker, Roger and Gilbert W. Bassett Jr.(1978). "Regression Quantiles", Econometrica, Vol.
46, No. 1, January, 33-50.
Leake, R. J. (1976). "A Method for Ranking Teams with an Application to 1974 College
Football". Management Science in Sports. North Holland.
Stefani, R. T.(1977). "Football and Basketball Predictions Using Least Squares", IEEE
Transactions on Systems, Man, and Cybernetics. February, p.117-121.
Stefani, R. T.(1980). "Improved Least Squares Football, Basketball, and Soccer Predictions",
IEEE Transactions on Systems, Man, and Cybernetics. v. SMC-10, n.2, February, p.116-123.
Stern, Hal (1992). "Who's Number One?-Rating Football Teams", Proceedings of the Section on
Statistics in Sports 1992, p.1-6.
Stern, Hal S. (1995). “Who’s Number 1 in College Football?... and How Might We Decide?”,
Chance, v.8,n.3, p.7-14.
Wilson, Rick L. (1995). “The “Real” Mythical College Football Champion”. OR/MS TODAY,
October, 1995, p. 24-29.
Table 1
1993 NFL Standings
OFFENSIVE and DEFENSIVE Ratings
Based on L2
TEAM W L PF PA OFF-Rating OFF-Rank DEF-Rating DEF-Rank OFF+DEF
DAL 12 4 376 229 58.7 2 41.3 2 100.0
HOU 12 4 368 238 57.9 4 39.6 4 97.6
BUF 12 4 329 242 55.6 9 39.6 5 95.2
KC 11 5 328 291 55.7 7 37.5 11 93.3
NYG 11 5 288 205 53.1 19 42.0 1 95.0
SF 10 6 473 295 63.8 1 36.3 16 100.0
LAA 10 6 306 326 54.9 11 34.8 21 89.7
DET 10 6 298 292 53.5 16 36.5 15 90.0
DEN 9 7 373 284 58.5 3 36.9 13 95.4
MIA 9 7 349 351 57.7 5 32.2 26 89.9
GB 9 7 340 282 56.5 6 37.1 12 93.6
PIT 9 7 308 281 53.9 15 38.0 8 91.9
MIN 9 7 277 290 52.5 20 37.7 9 90.2
SD 8 8 322 290 55.0 10 37.7 10 92.6
NO 8 8 317 343 54.6 13 34.4 22 89.0
PHA 8 8 293 315 54.6 14 36.2 17 90.8
NYJ 8 8 270 247 51.7 21 39.4 6 91.2
PHX 7 9 326 269 55.7 8 38.6 7 94.3
CLE 7 9 304 307 53.2 17 35.9 19 89.2
CHI 7 9 234 230 49.3 25 40.9 3 90.1
ATL 6 10 316 385 54.6 12 31.3 28 85.9
SEA 6 10 280 314 53.1 18 36.0 18 89.1
NE 5 11 238 286 49.8 24 36.8 14 86.6
TB 5 11 237 376 50.6 22 32.2 25 82.8
LAN 5 11 221 367 48.8 26 33.2 24 82.0
WAS 4 12 230 345 50.1 23 33.8 23 83.8
IND 4 12 189 378 47.6 27 31.5 27 79.1
CIN 3 13 187 319 46.4 28 35.5 20 81.9
Table 2
L2 and L1
OFF, DEF Ratings
TEAM OFF-L2-Rating OFF-L2-Rank OFF-L1-Rating OFF-L1-Rank DEF-L2-Rating DEF-L2-Rank DEF-L1-Rating DEF-L1-Rank
ATL 54.6 12 55.7 12 31.3 28 30.8 28
BUF 55.6 9 57.4 10 39.6 5 38.4 17
CHI 49.3 25 48.4 26 40.9 3 43.1 2
CIN 46.4 28 50.1 24 35.5 20 37.1 21
CLE 53.2 17 54.1 20 35.9 19 36.1 23
DAL 58.7 2 58.7 7 41.3 2 44.4 1
DEN 58.5 3 65.0 2 36.9 13 37.7 18
DET 53.5 16 57.0 11 36.5 15 38.4 16
GB 56.5 6 58.4 8 37.1 12 40.1 7
HOU 57.9 4 61.7 3 39.6 4 38.8 13
IND 47.6 27 47.4 28 31.5 27 31.7 27
KC 55.7 7 58.7 6 37.5 11 39.7 9
LAA 54.9 11 59.0 4 34.8 21 35.1 24
LAN 48.8 26 49.8 25 33.2 24 36.7 22
MIA 57.7 5 59.0 5 32.2 26 33.1 26
MIN 52.5 20 54.4 17 37.7 9 41.1 4
NE 49.8 24 51.1 23 36.8 14 38.8 14
NO 54.6 13 55.4 13 34.4 22 37.1 20
NYG 53.1 19 55.1 15 42.0 1 42.4 3
NYJ 51.7 21 52.4 22 39.4 6 39.4 10
PHA 54.6 14 52.7 21 36.2 17 37.4 19
PHX 55.7 8 57.7 9 38.6 7 39.8 8
PIT 53.9 15 54.1 19 38.0 8 40.4 6
SD 55.0 10 55.1 14 37.7 10 39.1 11
SEA 53.1 18 54.7 16 36.0 18 38.8 15
SF 63.8 1 66.1 1 36.3 16 41.0 5
TB 50.6 22 54.4 18 32.2 25 33.8 25
WAS 50.1 23 48.4 27 33.8 23 38.8 12
Table 3
L2 and L1 Team Ratings
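(OFF+DEF is R*t from the final score model (2.2); TEAM is the single rating R't from the point spread model (2.1).)
TEAM L2-OFF+DEF L2-TEAM L1-OFF+DEF L1-TEAM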
DAL 100.0 100.0 103.1 100
HOU 97.6 97.6 100.5 97.5
BUF 95.2 95.2 95.8 90
KC 93.3 93.3 98.4 93.5
NYG 95.0 95.0 97.5 91
SF 100.0 100.0 107.1 93
LAA 89.7 89.7 94.1 87.5
DET 90.0 90.0 95.5 90
DEN 95.4 95.4 102.8 94.5
MIA 89.9 89.9 92.2 84.5
GB 93.6 93.6 98.4 85
PIT 91.9 91.9 94.5 87
MIN 90.2 90.3 95.5 85.5
SD 92.6 92.6 94.2 96
NO 89.0 89.0 92.5 79.5
PHA 90.8 90.8 90.1 85.5
NYJ 91.2 91.2 91.8 85
PHX 94.3 94.3 97.5 91.5
CLE 89.2 89.2 90.1 82
CHI 90.1 90.1 91.5 83
ATL 85.9 85.9 86.5 79
SEA 89.1 89.1 93.5 85
NE 86.6 86.6 89.9 85.5
TB 82.8 82.8 88.1 82
LAN 82.0 82.0 86.5 71.5
WAS 83.8 83.8 87.2 79
IND 79.1 79.1 79.1 74
CIN 81.9 81.9 87.1 81.5