Tomorrow the sports business analytics class will learn cluster analysis, specifically k-means clustering. Recent lectures covered some methods of data analysis, ways of adjusting the data, and a basic linear fit, so clustering seemed a logical next step.
While we look at a few types of data, tomorrow we will focus on salary and points. We will look at these in levels as well as in differences from the five-year average for all teams in the NHL. Our discussion of the five-year time frame centered on the smoothing we might see from that length of data.
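For anyone who wants to see the "difference from the five-year average" step written out, here is a minimal sketch of one way to read it: each team's own five-year mean minus the league-wide five-year mean. That reading is my assumption for the example, and the team names and point totals are made up for illustration, not real NHL data.

```python
from statistics import mean

# Hypothetical numbers for illustration only: five seasons of points per team.
points = {
    "Team A": [100, 104, 98, 110, 108],
    "Team B": [80, 85, 90, 78, 82],
    "Team C": [95, 92, 88, 97, 93],
}

def five_year_deviation(by_team):
    """Return each team's five-year mean minus the league-wide five-year mean."""
    team_avg = {team: mean(vals) for team, vals in by_team.items()}
    # League average over every team-season in the five-year window.
    league_avg = mean(v for vals in by_team.values() for v in vals)
    return {team: avg - league_avg for team, avg in team_avg.items()}

deviation = five_year_deviation(points)
```

The same function works for salary. Teams above the league average come out positive and teams below it come out negative, which is what makes the second view differ from the levels.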
Tomorrow’s discussion will focus on choosing the number of clusters, but for now I am starting with four.
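To make the mechanics concrete, here is a rough sketch of k-means with four clusters in plain Python. It is not the tool used for the class graphs. The (payroll, points) pairs are hypothetical, and standardizing both columns first is my own assumption, made so that salary and points count equally in the distance.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against a constant column
    return [tuple((x - m) / s for x, m, s in zip(r, mus, sds)) for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm: assign to the nearest centroid, recompute, repeat."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k observations picked at random
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = tuple(mean(c) for c in zip(*members))
    return labels, centroids

# Hypothetical (payroll in $ millions, points) pairs, one per team.
teams = [(82.0, 109), (80.5, 104), (79.0, 88), (81.0, 85),
         (62.0, 101), (60.5, 97), (58.0, 72), (61.0, 68)]

scaled = standardize(teams)
labels, centroids = kmeans(scaled, k=4)
```

The result depends on the random starting centroids, so a different `seed` can give different groups. That is worth keeping in mind when deciding whether four is the right number.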
This graph shows the levels of points and salary. It is somewhat interesting how the teams group together. The most interesting part, though, is the comparison with the differences from the five-year average for all teams.
The groupings are different as a result. Some members of each group stay the same while others change. Personally, I cannot wait to turn the students loose on this and have them discuss it.
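One way the students could put a number on "some stay the same while others are different" is to compare the two clusterings pair by pair. Cluster numbers are arbitrary, so the sketch below ignores them and only asks whether two teams land in the same group. The labels are hypothetical, and the pairwise share (often called the Rand index) is my suggestion rather than something from the lecture.

```python
from itertools import combinations

# Hypothetical cluster labels for the same six teams under the two clusterings.
levels = {"A": 0, "B": 0, "C": 1, "D": 1, "E": 2, "F": 3}
deviations = {"A": 2, "B": 2, "C": 0, "D": 3, "E": 3, "F": 1}

def pair_agreement(first, second):
    """Share of team pairs treated the same way (together or apart) by both."""
    pairs = list(combinations(sorted(first), 2))
    same = sum((first[a] == first[b]) == (second[a] == second[b])
               for a, b in pairs)
    return same / len(pairs)

def split_up(first, second):
    """Pairs grouped together in `first` but separated in `second`."""
    return [(a, b) for a, b in combinations(sorted(first), 2)
            if first[a] == first[b] and second[a] != second[b]]

agreement = pair_agreement(levels, deviations)
moved_apart = split_up(levels, deviations)
```

A value of 1.0 would mean the two views group the teams identically. The list from `split_up` names the pairs that would make good discussion starters.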