Can We Simplify Blast Motion?
For teams wanting to get more granular than batted ball metrics when analyzing their hitters, Blast Motion can be a great tool. While Blast does a nice job explaining what each metric means and how it can be impactful, the plethora of metrics they provide can be overwhelming to those who are just getting started with this tool.
For teams with player development departments, every metric is important: staff need to see exactly where a player is underperforming and where he is right on track, and knowing exactly what is going on will obviously influence a player’s development plan. When it comes to evaluating players, however, a lot of the metrics are redundant. A team may be curious about a player’s power, but it may believe that power is power, and that five metrics capturing power in slightly different ways aren’t especially meaningful. In that case, one metric combining all the other power metrics may be a more viable way of looking at a player, and could be more user friendly for non-technical employees.
With that said, today’s article is aimed at helping both player development and player evaluation departments work with Blast Motion metrics. We will be looking at ways to reduce the number of Blast metrics while keeping most of the variation they capture. This can be done through principal component analysis (PCA), a helpful technique when we have a large number of possibly correlated variables that we want to reduce down to a few independent numbers. By doing so we can see where Blast metrics are similar, which helps coaches and development staff, and also shows teams where they could simplify player evaluation.
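As a concrete illustration, here is a minimal sketch of running PCA in Python with scikit-learn. The metric names and values are synthetic stand-ins, not either of the actual data sets; standardizing first matters because Blast metrics live on very different scales.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_swings = 500

# Simulate a few correlated "power"-style metrics: bat speed drives the others.
# These are made-up relationships for illustration only.
bat_speed = rng.normal(70, 4, n_swings)
peak_hand_speed = 0.3 * bat_speed + rng.normal(0, 1, n_swings)
power = 0.05 * bat_speed + rng.normal(0, 0.3, n_swings)
on_plane_eff = rng.normal(75, 8, n_swings)  # roughly independent contact metric

X = np.column_stack([bat_speed, peak_hand_speed, power, on_plane_eff])

# Standardize before PCA -- it is scale-sensitive, and these columns mix units.
X_std = StandardScaler().fit_transform(X)
pca = PCA().fit(X_std)

# Each entry is the share of total variance one dimension explains.
print(pca.explained_variance_ratio_)
```

Because three of the four simulated metrics are correlated, the first dimension should absorb well over a quarter of the variance on its own.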
For this project there are two data sets. One is from a mid-major DI baseball team: 43,904 swings across 18 players, with a per-player minimum of 495 swings and a maximum of 6,254. For this data I removed tee swings and left all other environments, since these players sometimes do specific tee drills that I’ve noticed may skew their metrics. The other data set is from a Cape Cod League team last summer: 13,069 swings across 15 players, with a per-player minimum of 362 and a maximum of 1,318.
Before I proceed, I want to note that this article is in no way about which metrics to look at or ignore. Other articles with more data do a good job of that. The aim here is simply to see whether the many existing Blast metrics can be boiled down into one or two numbers.
Our first goal is to see if it is even possible to reduce Blast Motion metrics down. It’s possible this theory is wrong. But if it’s right, player evaluation departments could begin to simplify how they look at Blast data. What we’re looking at above is a plot showing how much each dimension (a combination of the various metrics) contributes to explaining the variance in the data. Two dimensions, or numbers the data can be boiled down into, explain about 50-55% of the variance, and three explain about 70%. This is found by adding up the numbers on the bars. The top graph is for the D1 metrics and the bottom is for the Cape Cod metrics.
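That "adding up the bars" step is just a cumulative sum of the explained variance ratios. A sketch with synthetic data, loosely mimicking the situation above by generating 11 observed metrics from 3 underlying factors:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# 11 observed, correlated columns built from 3 hidden factors plus noise.
base = rng.normal(size=(1000, 3))
mix = rng.normal(size=(3, 11))
noise = rng.normal(scale=0.3, size=(1000, 11))
X = base @ mix + noise

pca = PCA().fit(StandardScaler().fit_transform(X))

# Running total of variance explained: the scree-plot bars, added up.
cumulative = np.cumsum(pca.explained_variance_ratio_)
print(cumulative[:3])  # share explained by the first one, two, three dimensions
```

With only three true factors behind the 11 columns, the first three dimensions should capture the large majority of the variance, which is the same kind of readout we took from the scree plots.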
So, what exactly are these dimensions? As mentioned, they’re linear combinations of the original metrics. Essentially, we’ve combined a bunch of similar metrics into one number that tries to capture all of them. That number isn’t aimed at picking the metric most predictive of success; it doesn’t, for example, weight bat speed heavily. It is just meant to collapse the variance of similar metrics into one number. Since three dimensions explain about 70% of the variation in the data, we should take a look at what those dimensions are made of.
Our first dimension, the one explaining the most variation in the data, seems to focus on what a lot of people consider Blast’s power metrics. Variables that are correlated with our first dimension are the most important to explaining the variability in the data. The red dashed line in these graphs shows the expected average contribution, so anything below that line we know doesn’t contribute much. Based on this, Time to Contact, Power, Rotational Acceleration, Peak Hand Speed and Bat Speed make up our first dimension. These are also all known to relate to exit velocity in varying amounts. This is encouraging, since it suggests our hypothesis of reducing these power metrics to one number may be possible. While there are slight differences between the D1 team’s data (left) and Cape Cod team’s data (right), it is also reassuring to see they still generally line up.
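For reference, a variable’s contribution to a dimension is its squared loading as a share of all squared loadings on that dimension, and the dashed reference line sits at 100/p percent for p variables. A sketch with synthetic data and hypothetical metric names:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
cols = ["bat_speed", "peak_hand_speed", "power", "rot_accel", "time_to_contact"]

# Synthetic stand-in data; make two of the columns correlated.
X = rng.normal(size=(300, len(cols)))
X[:, 1] += 0.8 * X[:, 0]

pca = PCA().fit(StandardScaler().fit_transform(X))

loadings = pca.components_[0]                       # weights defining dimension 1
contrib = 100 * loadings**2 / np.sum(loadings**2)   # % contribution per variable
expected_avg = 100 / len(cols)                      # the dashed reference line

# Variables above the line are the ones that define the dimension.
for name, c in sorted(zip(cols, contrib), key=lambda t: -t[1]):
    flag = "*" if c > expected_avg else " "
    print(f"{flag} {name}: {c:.1f}%")
```

This mirrors how the contribution bar charts are read: anything clearing the expected-average line is treated as part of the dimension, the rest is ignored.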
The second dimension of the data can be thought of as more of a pre-impact contact dimension. It features Connection Differential (Early Connection – Connection at Impact), On Plane Efficiency and Early Connection. Surprisingly, Bat Speed and Power also played a role for the D1 data (left), and Attack Angle mattered for the Cape Cod data (right).
Our third and final dimension shows a similar thing for both data sets. The two biggest contributors are Connection at Impact and Vertical Bat Angle. This dimension can be thought of as where the body and bat are at contact: another contact dimension, but one focused on the point of contact rather than the path to contact, which the previous dimension captured.
I find these three dimensions also reflect well on our hypothesis. Together they explain about 70% of the variance in the data, and being able to do that with 3 numbers instead of 11 could be extremely valuable for a team trying to get a handle on how good a player is at producing power or making contact.
In our next article, we’ll dive into this a bit deeper, looking closer at where all Blast metrics are similar.