Before I begin, full disclosure: I believe in the value of what the hockey community calls ‘advanced stats’, even though this post may seem to argue the contrary. This season my interest in advanced stats was piqued when Patrick Roy, coach of the Colorado Avalanche, held a press conference in October dismissing the value of Corsi. Roy made a number of comments on the shortfalls of stats like Corsi, and one or two were even convincing. As much as I disagreed with Roy’s approach, it got me thinking about the limitations of Corsi and the other statistics emerging around the game of hockey. Brian Burke was infamously quoted in 2013, at the MIT Sloan Sports Analytics Conference, saying, “Statistics are like a lamp post to a drunk: Useful for support but not for illumination,” demonstrating (in addition to a lack of understanding of where he was at the time) the reluctance many ‘hockey people’ have to seriously incorporate these statistics into their decision-making. Many, including Burke himself, still refer to hockey as an eyeball business and are hesitant to use statistics as anything more than a complementary piece in the decision-making process. They would seemingly prefer to linger in the ‘ice ages’ of hockey rather than embrace a source of new information. That being said, I acknowledge that statistics such as Corsi have some limitations, and I’d like to address a few of those here.
For a start, in its modern form Corsi, also commonly referred to as ‘possession’, counts the shots on goal, blocked shots, and shots that missed the net, for and against, and turns them into a ratio used to interpret which team has been driving the play. The core idea is that a team must have the puck to produce any of those three events, hence the moniker ‘possession’. Pundits and commentators typically use Corsi to determine which team is outplaying the other, on the assumption that the better team will attempt more shots at the net and thus have the puck more. However, it’s worth noting that this was not the intention of Jim Corsi, the inventor of the statistic. Mr. Corsi, during his time as the Buffalo Sabres’ goaltending coach, used these events to track the amount of effort his goalie expended per game, which helped him determine when his starting goalie needed a night off. Corsi’s belief, when he began tracking these events, was that the goalie has to react and expend effort on every attempted shot, whether it makes it to the net or not. That made attempts a better measure of a goalie’s workload during a game than shots on goal alone. While this doesn’t necessarily diminish the value of the modern interpretation of Corsi, it’s worth knowing that right from the get-go we’ve been using Corsi in a way other than the one intended.
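To make the ratio concrete, here is a minimal Python sketch of how a Corsi share is commonly computed: attempts for, divided by all attempts by both teams. The function names and the event counts are my own hypothetical illustrations, not figures from any real game, and returning 50% when no attempts have been recorded is simply a convention I assumed for the sketch.

```python
def corsi_events(shots_on_goal: int, missed_shots: int, blocked_shots: int) -> int:
    """Total shot attempts: shots on goal + missed shots + attempts that were blocked."""
    return shots_on_goal + missed_shots + blocked_shots


def corsi_for_pct(corsi_for: int, corsi_against: int) -> float:
    """Share of all shot attempts taken by one team, as a percentage."""
    total = corsi_for + corsi_against
    if total == 0:
        # Assumption for this sketch: with no attempts by either team, call it even.
        return 50.0
    return 100.0 * corsi_for / total


# Hypothetical counts, for illustration only.
team_a = corsi_events(shots_on_goal=10, missed_shots=6, blocked_shots=4)  # 20 attempts
team_b = corsi_events(shots_on_goal=8, missed_shots=4, blocked_shots=3)   # 15 attempts

print(f"Team A Corsi share: {corsi_for_pct(team_a, team_b):.1f}%")
```

A share above 50% is what commentators mean when they say a team is ‘driving the play’.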
Anyway, the primary concern with Corsi, at least the one Roy addressed in his press conference, is that it doesn’t differentiate between shots of different quality. While it’s admittedly unlikely that a team would deliberately and regularly attempt shots from poor angles, or shots with a poor chance of scoring, consider the table below:
You can see that while Team B had two more scoring chances in this hypothetical period, they were thoroughly out-Corsi’d by Team A. Roy may have a point here: the average hockey fan who consults Corsi often doesn’t look at the breakdown of events, and the box-score Corsi alone can give a misleading picture of the real state of the game. This is a drastic example that I conveniently made up, but it illustrates a limitation of Corsi, and perhaps why Roy prefers to break his games down by scoring chances.
Another concern, brought to me by a teaching assistant and a professor who apply quantitative analysis to hockey, is that two-thirds of what Corsi measures, blocked and missed shots, have no chance of generating a goal. Again, it’s an exaggerated example, but the chart below illustrates just how misleading Corsi can be.
Despite mustering only 70% of the shots on goal that Team A did, Team B still came out with the higher Corsi. A glance at Corsi alone might make you think that Team B has been the better team; however, Team A has generated more shots on goal, and thus had a higher potential to score, despite having fewer overall attempts as measured by Corsi. This is where the term ‘possession’ comes into play. Corsi measures the number of shot attempts each team has had, and since each attempt requires the puck, counting these events gives observers an idea of who has had the puck most, which is obviously a critical element in scoring. The shortfall is that every shot attempt, and more specifically every shot on net, is recorded as a possession-marking event regardless of quality. This inability to pick out quality scoring chances can easily lead to misinterpretations of Corsi, especially regarding which team has been generating the most potential offense.
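The gap between Corsi and shots on goal is easy to reproduce with a few lines of Python. The numbers below are hypothetical ones I chose only so that Team B has 70% of Team A’s shots on goal, as in the example above; they are not the figures from the original chart, and the `ShotLine` class and `share` helper are names I made up for the sketch.

```python
from dataclasses import dataclass


@dataclass
class ShotLine:
    """One team's shot attempts in a game (hypothetical structure for this sketch)."""
    on_goal: int
    missed: int
    blocked: int

    @property
    def corsi(self) -> int:
        # Corsi counts every attempt, whether or not it reached the net.
        return self.on_goal + self.missed + self.blocked


def share(ours: int, theirs: int) -> float:
    """Fraction of the combined total that belongs to 'ours'."""
    return ours / (ours + theirs)


# Hypothetical numbers: Team B has 70% of Team A's shots on goal.
team_a = ShotLine(on_goal=10, missed=2, blocked=2)   # Corsi 14
team_b = ShotLine(on_goal=7, missed=6, blocked=6)    # Corsi 19

corsi_share_b = share(team_b.corsi, team_a.corsi)      # 19 / 33
sog_share_b = share(team_b.on_goal, team_a.on_goal)    # 7 / 17

print(f"Team B Corsi share:         {corsi_share_b:.1%}")
print(f"Team B shots-on-goal share: {sog_share_b:.1%}")
```

With these made-up inputs Team B wins the Corsi battle while losing the shots-on-goal battle, which is exactly the kind of split a box-score Corsi hides.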
Another factor that can significantly influence a team’s Corsi is score effects. ‘Score effects’ is the statistical community’s term for the tendency of a leading team to sit back and defend its lead, and the corresponding tendency of a trailing team to take more shots in an effort to generate offense. Interestingly enough, this appears to be universal rather than a specific coaching tactic, since score effects have been observed for every team in the league. In essence, score effects attempt to explain why, in NHL games that end in regulation, the team that outshoots its opponent loses more than 50% of the time. This seems somewhat counter-intuitive, but it’s apparently true. To explain why, NHL Numbers put together a great chart illustrating how a game’s score affects the way teams generate shots, using the NHL averages from 2007-2012, which I’ve included below:
While a leading team’s shooting percentage increases as its lead grows, likely because it exploits the high-risk chances taken by the trailing team, its shot share gradually shrinks. Conversely, a trailing team takes a greater volume of shots in an effort to score and tie the game. The trailing team’s shooting percentage stays roughly the same regardless of the score, suggesting that it gains no advantage in the quality of its shots, just in volume. The impact on Corsi is that, based on these numbers, a team that falls behind early in a game is likely to outshoot, and thus out-Corsi, its opposition. The point I’m trying to make here is that Corsi, and more specifically shot generation, is affected by situational factors such as the score of the game, and can be a misleading statistic if considered on a game-to-game basis. If a team builds a 2-0 lead in the first period and maintains it for the rest of the game, winning 2-0, they are likely to be outshot and out-Corsi’d, but would that mean it was a bad game?
Corsi is a great help in understanding which team has had possession of the puck in the offensive zone, since each event it counts requires the puck to be directed at the opposition’s net, which typically occurs in the offensive zone, but it cannot discern the quality of that possession very effectively. It provides a good idea of which team has been attacking the net, even if their shots don’t make it to the opposing goaltender, and thus gives a more comprehensive understanding of the game than shots on goal alone. However, it certainly isn’t without its flaws, and it can mislead those who aren’t deliberately cautious when consulting a game’s box score. While I wouldn’t advocate Montreal’s or Vancouver’s approach to analytics, which is to ignore them altogether, I am certainly willing to acknowledge that these statistics can be difficult to interpret correctly if one is not careful.
Brian Burke article: https://www.thestar.com/sports/leafs/2013/03/01/former_toronto_maple_leafs_gm_brian_burke_speaks_out_a_conference.html