For some, the 1972 Canada-Russia Summit Series is a defining moment in Canada’s history. At the height of the Cold War, 28 hockey players went into a tournament thinking it was an exhibition series that no one would take seriously. By the end, the 28 were defenders of Canadian hockey against the surprising hockey prowess and political power of the Russian bear. The final game, number 8 in the series of 8, was particularly dramatic. After a closely fought first period, the Canadians fell behind 5-3 in the second, but came back to tie it in the third. It stayed tied until 19:26 of the third period when Paul Henderson … well, if you don’t know the story, you probably aren’t Canadian and don’t care anyway. Still, here’s the game on YouTube if you want to see more.
Game 8 is a good case for seeing the centrality measures of social network analysis at work. A hockey game is a network where people pass a puck to and from each other over the course of 60 minutes. Each time the puck passes from one player to another, we can create a directed tie. From those ties we may also be able to make some statements about the game. For instance, is it more important to give passes to, or receive passes from, a diverse group of players? Who passes to the biggest passers? Who receives passes from them? Rather than going through all this preamble, how about I just get to it?
Maybe later I will add the last names, but right now I’m going with the numbers. You can use these rosters as a key to find your favourite players. Team Canada is listed first, followed by the Soviet team.
- 02: Gary Bergman
- 03: Pat Stapleton
- 05: Brad Park (if it weren’t for Bobby Orr, the greatest defenseman of his era)
- 06: Ron Ellis
- 07: Phil Esposito (big goal scorer and captain of the team)
- 08: Rod Gilbert
- 10: Dennis Hull (Brother to Bobby)
- 12: Yvan Cournoyer (The Roadrunner)
- 17: Bill White
- 18: Jean Ratelle (One of the great Rangers, amazing goal scorer)
- 19: Paul Henderson (A very good player, but the hero of the series)
- 20: Pete Mahovlich (Overshadowed by his brother Frank, but actually very good)
- 22: J.P. Parise (Great player, but got thrown out early for threatening to slash the refs)
- 23: Serge Savard (Eventually became captain of the Habs)
- 25: Guy Lapointe
- 27: Frank Mahovlich (Big “M” – hero of the Toronto Maple Leafs)
- 28: Bobby Clarke (was chosen last for the team, but brought the Broad Street Bully element to the game.)
- 29: Ken Dryden
- 02: Alexandre Gusev
- 03: Vladimir Lutchenko
- 06: Valeri Vasilev
- 07: Gennady Tsygankov
- 08: Vladimir Vikulov
- 09: Yuriy Blinov
- 10: Alexander Maltsev
- 12: Yevgeni Mishakov
- 13: Boris Mikhailov
- 15: Alexander Yakushev
- 16: Vladimir Petrov
- 17: Valeri Kharlamov
- 19: Vladimir Shadrin
- 22: Vyacheslav Anisin
- 25: Yuri Liapkin
- 30: Alexander Volchkov
- 20: Vladimir Tretiak
Getting the Data Using R’s iGraph Library
The data were created using edge lists separated by spaces. Here is a sample of what it looks like:
A few things that may be added in the future are the times of the passes, goals, steals (although these could be calculated from the edges themselves), power-play information and so on. But for now, I just have the edge lists. The first entry is the “from” player (Rus=“Russia”, Can=“Canada” and Off=“Official / Referee”) and the second is the “to” player. You can enter the information into an R graph object pretty easily using igraph. You can assign descriptive values to the hockey players (vertices) by using V(df)$description. In this case, I’ve used color to easily tell the Russians from the Canadians in the graph plots (igraph will automatically use a vertex attribute named color when plotting).
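To make the format concrete, here is a minimal base-R sketch of reading such an edge list. The rows and the exact label pattern (a three-letter team code followed by a jersey number, e.g. Can07) are my assumption for illustration only; they are invented and are not taken from the game data.

```r
# Hypothetical rows, invented for illustration; NOT taken from the game data.
# Each line is "<from> <to>", separated by whitespace.
sample_text <- "
Can07 Can12
Can12 Rus03
Rus03 Rus15
Rus15 Can05
Can05 Can07
"

# sep = "" splits on any run of whitespace; blank lines are skipped by default.
el_sample <- read.table(text = sample_text, header = FALSE, sep = "",
                        col.names = c("from", "to"),
                        stringsAsFactors = FALSE)

# The first three characters of each label carry the team code.
el_sample$from_team <- substr(el_sample$from, 1, 3)
el_sample$to_team   <- substr(el_sample$to, 1, 3)

print(el_sample)
```

A row whose two team codes differ is a change of possession between teams; a row whose codes match is a pass between team-mates.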
library(igraph)
el <- read.csv("summitseries.txt", header=FALSE, sep="") # sep="" means any whitespace
df <- graph.data.frame(el) # create a graph from the data frame el (graph_from_data_frame() in newer igraph)
# Create a color vertex attribute so that Russians are red, Canadians are white and the refs are black.
team <- substr(V(df)$name, 1, 3) # the first three characters are the team code
V(df)$color <- ifelse(team == "Rus", "red", ifelse(team == "Can", "white", "black"))
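Degree refers to the number of ties a player has: the times he passed or lost the puck to someone, plus the times he received or stole the puck from someone. Every row of the edge list becomes its own tie, so repeated exchanges between the same two players each add to the total. It’s basically a count of the number of “sticks” for each ball.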
The code to calculate the values is this:
V(df)$degree <- degree(df)
Each player gets a value based on the total number of pucks received or sent. To plot:
plot(df, vertex.size=V(df)$degree, layout=layout.kamada.kawai) # layout_with_kk in newer igraph
This is what the graph looks like:
This graph is not particularly meaningful, but it does offer a few insights. For instance, Phil Esposito (#7) was out a lot in this game and managed both to take passes away from the Russians and to lose them. It kind of speaks to his garbage-can approach to hockey: his play in this period, like most days, was gritty and he found himself in the midst of almost every play. This also shows quite a bit of the classy Russian style of play with a lot of quick passes and fancy footwork. Almost every player on the Russian team had the puck quite a bit. Vladimir Shadrin (#19) is mostly ignored in the English world today, but he was amazing in this series, scoring more than even the Russian hero Valeri Kharlamov (#17), who barely shows up on the charts.
V(df)$outdegree <- degree(df, mode="out")
“Out” degree is the same measure, but only counting “outgoing” ties. These represent passes made or pucks given away.
Like I said earlier, Phil Esposito (#7) was finding himself giving the puck away quite a bit in this first period, but also making some pretty strong passes. Brad Park was also pretty busy. Both these guys happened to score goals in the period, by the way. On the Russian side, Lutchenko (#3) and Yakushev (#15) are nothing particularly special in the pass department even though they scored goals as well. That could be because the Russians were much more team players.
V(df)$indegree <- degree(df, mode="in")
Not too much more to say about this one, except that it’s not too different from the outdegree measures. This is not that surprising given that if you have the puck either you are going to pass it or someone will steal it from you. I should also note that goalie Ken Dryden (#29) was pretty busy in this period. Not good for Canada.
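For anyone who wants to see what these three counts amount to, here is a small base-R sketch that tallies them straight from an edge list, without igraph. The passes and player labels below are invented for illustration and are not the game data; the column names of deg are my own choice.

```r
# Hypothetical passes, invented for illustration (not the game data).
el_toy <- data.frame(
  from = c("Can07", "Can07", "Can05", "Rus15", "Rus15", "Can12"),
  to   = c("Can12", "Rus15", "Can07", "Rus03", "Can05", "Can07"),
  stringsAsFactors = FALSE
)

players <- sort(unique(c(el_toy$from, el_toy$to)))

# Tabulate over a fixed set of levels so players with a zero count still appear.
out_counts <- table(factor(el_toy$from, levels = players))
in_counts  <- table(factor(el_toy$to,   levels = players))

deg <- data.frame(
  player    = players,
  outdegree = as.integer(out_counts),  # pucks passed or lost
  indegree  = as.integer(in_counts),   # pucks received or stolen
  stringsAsFactors = FALSE
)
deg$total <- deg$outdegree + deg$indegree  # plain degree: every tie counted once per end

print(deg)
```

Because every tie leaves one player and arrives at another, the in and out columns always add up to the same number across the whole table, which is part of why the two plots look so alike.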
Bonacich Power (Beta=0.5)
Now we can look at some eigenvector-like centrality measures. There are a variety of them, but I’ve decided to use Bonacich in this case. Bonacich uses a beta value that assigns a weight to the degree centrality of the neighbours. In the case of a positive value (cooperative networks), the more “passy” your neighbour, the more power you have. Unfortunately, this method produces both positive and negative values which is a little challenging for plotting. So I have a little linear mapping function that I borrowed from here:
linMap <- function(x, from, to)
(x - min(x)) / max(x - min(x)) * (to - from) + from
And then assign the values and plot.
V(df)$eigen <- bonpow(df, exponent=0.5)
plot(df, vertex.size=linMap(V(df)$eigen, 0, 25))
The picture is a little bit different in this case. Now we see the great New York Ranger, Jean Ratelle (#18), finding his way into the largest influencer position along with Bobby Clarke (#28). On the Russian side, Yakushev (#15), Kharlamov (#17) & Mishakov (#12) find themselves in their rightful place as the elite members of their team. Phil Esposito, on the other hand, shrinks to almost nothing. Why? Well, he tends to find himself taking and losing the puck from defensive players more than picking up passes from his line-mates Yvan Cournoyer (#12) & Frank Mahovlich (#27).
Bonacich Power (Beta=-0.5)
V(df)$bonpow <- bonpow(df, exponent=-0.5)
plot(df, vertex.size=linMap(V(df)$bonpow, 0, 25))
The picture is also quite different when looked at from a negative Bonacich power perspective. Usually negative Bonacich power is used for networks that are competitive in nature, when it’s much better to have less powerful neighbours.
In this case, it’s pretty obvious that it’s the defensemen who have the least powerful neighbours. This makes sense because defensemen usually end up playing with a wider variety of forwards than other forwards do. Canada’s top defenseman, Brad Park, certainly found himself passing to and from the lesser lines in the first period, and likely stealing from Russia’s lesser lines as well!
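To see how the sign of beta flips the picture, here is a rough base-R sketch of the textbook Bonacich formula, c = (I - beta*A)^-1 A 1, scaled so that the squared scores sum to the number of nodes. The bonacich_power() helper is a hypothetical name of my own, not an igraph function, and the five-node chain is an invented toy network (undirected, unlike the game data), so treat it as an illustration rather than a reproduction of what bonpow does internally.

```r
# Hypothetical helper: Bonacich power c = (I - beta * A)^-1 A 1,
# scaled so that the squared scores sum to the number of nodes.
bonacich_power <- function(A, beta) {
  n <- nrow(A)
  raw <- solve(diag(n) - beta * A, rowSums(A))
  setNames(as.vector(raw) * sqrt(n / sum(raw^2)), rownames(A))
}

# Toy undirected chain A - B - C - D - E (invented; not the game data).
nodes <- c("A", "B", "C", "D", "E")
A <- matrix(0, 5, 5, dimnames = list(nodes, nodes))
for (i in 1:4) {
  A[i, i + 1] <- 1
  A[i + 1, i] <- 1
}

pos <- bonacich_power(A, beta = 0.5)   # well-connected neighbours add power
neg <- bonacich_power(A, beta = -0.5)  # weakly connected neighbours add power

# Same rescaling helper as in the post, used to turn scores into vertex sizes.
linMap <- function(x, from, to)
  (x - min(x)) / max(x - min(x)) * (to - from) + from

print(round(rbind(pos, neg), 3))
print(round(linMap(neg, 0, 25), 1))
```

In this chain the middle node C comes out on top with beta=0.5, while with beta=-0.5 the top spots go to B and D, each of which is tied to a dead-end neighbour. Note also that linMap sends the lowest score to the bottom of the range, so with a range starting at 0 the weakest vertex is drawn with size zero.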
Conclusion (for now)
This post goes to show that you can get a different answer from a social network analysis depending on how you decide to measure it. There are no mind-blowing revelations here (likely because the game was somewhat even at this stage), but there is enough diversity among the different graphs to give pause. At the end of the day, this is why it is important to think clearly about your research question before you start looking at your data. If you don’t, you’ll probably find yourself getting the answer you want just by rolling through different measures. I haven’t even gone through all the possibilities: betweenness, closeness, clustering values and alpha centrality are all measures I’ve decided to leave out just for now (but may revisit later).
Another thing that might be interesting to look at is what the centrality values look like when I separate the Canadians from the Russians. From that perspective you could see how well the players on each team work with each other. Also, we could look at the edges where the puck changed hands from one team to another. In this case the negative Bonacich power may be quite telling about who was really coughing up the puck to the wrong people.
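As a rough idea of how that split could work, here is a base-R sketch that separates within-team passes from changes of possession using the team code at the start of each label. The rows and labels are invented for illustration (not the game data), and the team() helper is a hypothetical name of my own.

```r
# Hypothetical passes, invented for illustration (not the game data).
el_toy <- data.frame(
  from = c("Can07", "Can07", "Can05", "Rus15", "Rus15", "Can12", "Off01"),
  to   = c("Can12", "Rus15", "Can07", "Rus03", "Can05", "Can07", "Can07"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the first three characters are the team code.
team <- function(x) substr(x, 1, 3)

from_team <- team(el_toy$from)
to_team   <- team(el_toy$to)

# Passes that stay within one team.
can_passes <- el_toy[from_team == "Can" & to_team == "Can", ]
rus_passes <- el_toy[from_team == "Rus" & to_team == "Rus", ]

# Ties where the puck changed teams, leaving out anything involving an official.
turnovers <- el_toy[from_team != to_team &
                    from_team != "Off" & to_team != "Off", ]

# Who gave the puck away, and how often.
giveaways <- sort(table(turnovers$from), decreasing = TRUE)

print(turnovers)
print(giveaways)
```

Each of the three subsets is itself an edge list, so it could be turned into its own graph and run through the same centrality measures as the full network.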
The data is not up on my git site yet, but I will share it eventually. I’ll keep the data open so that people can add to it or edit it as need be. Certainly there may be problems with the way I coded everything. It was not always easy to see who was touching the puck. Sometimes I just had to guess based on position and the usual line-ups.