Sarris: What’s more important for a pitcher, command or stuff?

It’s a backfield, barstool argument that has gone on for years: What’s more important, stuff or command? What matters more: sharp, hard pitches with plenty of bend, or the ability to put those pitches in the right places?

Early on, we didn’t quite have the statistics to answer the question. We could look at things like strikeout rate and walk rate, but then we’d find that combining the two provided us with perhaps the best in-season estimate of a pitcher’s talent. And, anyway, neither of those statistics cleanly captures the skills we’re talking about. It takes command to get strikes, and you can have a low walk rate without command, for example.

But statistics have improved, and we’ve got a couple of new ones that might finally capture command and stuff separately. And if that’s true, we might be able to settle this argument.

From STATS Perform, we have Command+. In order to produce a number that hopes to assess a pitcher’s ability to do exactly what he intends with his pitches, STATS’ stringers at baseball games look at the catcher’s signs, the catcher’s target, the pitcher’s heat maps, and the hitter’s heat maps. They’ll then assign a target for that pitch and consider how far from that target the pitch ended up.

What’s nice about this approach versus more zone-based ones is that it’s obvious that a pitcher will intend to put a curveball in the dirt sometimes, an act that would be the result of good command, but which would not produce a pitch that landed in the corners of the strike zone. (This also separates command from control, because a well-placed pitch can be in or outside of the strike zone, while control generally refers to the ability to put the ball in the strike zone, anywhere.)

Command+ is calculated per pitch type and then scaled to league average, where 100 is average. STATS also aggregates it so that there’s an overall number for the pitcher, and not just one number for every pitcher’s separate pitch. Last year’s ten best starters by Command+ were Masahiro Tanaka, Tyler Mahle, Kyle Hendricks, Mike Leake, José Berríos, Zach Davies, Zach Eflin, Aaron Nola, Trevor Williams, and Zack Greinke.
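For the curious, here is roughly what "scaled to league average" can look like. STATS hasn't shared its formula here, so this is only a minimal sketch under an assumed convention common to "plus" stats: divide each score by the league mean and multiply by 100. The `plus_index` helper, the raw scores, and the higher-is-better direction are all hypothetical.

```python
from statistics import mean

def plus_index(raw_scores):
    """Rescale raw scores so that the group average lands on 100."""
    league_avg = mean(raw_scores.values())
    return {name: 100 * score / league_avg for name, score in raw_scores.items()}

# Hypothetical raw command scores (assumed: higher is better); not STATS data.
raw = {"Pitcher A": 0.60, "Pitcher B": 0.50, "Pitcher C": 0.40}
indexed = plus_index(raw)

for name, value in indexed.items():
    print(name, round(value))
```

Under that assumption, a 120 reads as 20 percent better than the league-average score and an 80 as 20 percent worse.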

From Driveline Baseball, a pitching lab headed by Kyle Boddy, who is now the pitching coordinator for the Cincinnati Reds, we have Stuff. That number hopes to capture only the effect that the velocity and movement of the pitch have on outcomes. It’s a number I’ve been chasing since I started writing about pitching and was convinced that stuff was the most important aspect of a pitcher’s success. Driveline sent over this explainer for the stat.

“Based heavily off of the work of Glenn Healey’s intrinsic run values, our Stuff Metric attempts to predict the value of each pitch thrown during the 2019 season given the velocity, movement metrics, and estimated arm slot of the offering. Our model also controls for pitch location, batter talent, the count, and any platoon advantages that might occur, leaving us with a value that quantifies only the physical characteristics of a pitch’s profile in units of runs.”

The ten best starters last year by this Stuff metric were Tyler Glasnow, Gerrit Cole, Garrett Richards, Walker Buehler, Sonny Gray, Frankie Montas, Noah Syndergaard, Luis Castillo, Jacob deGrom and Mike Clevinger.

Command. Stuff. Let’s gooooooo.

Running the numbers, we have Konstantinos Balafas, a YMCA basketball league-mate of mine and (more importantly) a data scientist. Balafas used a Random Forest machine learning model with around 7,000 pairs of pitcher Command+ and Stuff as the inputs. The result gave us some “contours” that showed us the predictive quality (read: importance) of each stat in explaining the variance in certain outputs. He then smoothed out the numbers to make the graphs easier to read and make the trends clearer.

First, we ran the numbers against ERA. Though ERA has flaws, it serves here as a general marker of overall quality.

You’re looking for the blue on this heatmap, and of course the bluest blues reside in the top right corner, where the pitchers have both command and stuff. But in order to look at the relative impact of each, just look at how much bluer it gets in the high stuff areas than in the high command areas.

You can ignore the next paragraph if you aren’t interested in the nitty-gritty.

The model wasn’t very strong in terms of predictive strength — the r-squared for in-set correlation was merely .094, and for out of set prediction .045, meaning that this combination of Stuff and Command+ explained less than 10 percent of the variance in the set, and predicted less than 5 percent of the variance outside the set. The predictive strength could be better if we smoothed less or took different approaches. One Neural Network model we tried had a 0.85 in-set r-squared for the same inputs and outputs, but that was probably an overfit model, as the out-of-set r-squared was less than zero.
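If a negative r-squared sounds impossible, a small worked example may help. This is a sketch with made-up ERAs, not our model or our data: `r_squared` uses the usual definition (one minus the ratio of squared prediction error to squared deviation from the mean), which drops below zero whenever a model predicts worse than simply guessing the average every time.

```python
from statistics import mean

def r_squared(actual, predicted):
    """1 - SS_res / SS_tot. Negative when the predictions are worse
    than always guessing the mean of `actual`."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Made-up ERAs. An overfit model that has memorized its training set
# reproduces those values exactly.
train_era = [3.0, 4.0, 5.0]
in_set = r_squared(train_era, [3.0, 4.0, 5.0])

# The same (hypothetical) model on pitchers it has never seen
# gets the ordering backwards.
test_era = [3.5, 4.0, 4.5]
test_pred = [4.5, 4.0, 3.5]
out_of_set = r_squared(test_era, test_pred)

print(in_set, out_of_set)  # 1.0 -3.0
```

A perfect in-set score paired with a negative out-of-set score is the signature of overfitting: the model learned the sample, not the relationship.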

The less technical takeaway is this: We haven’t figured out pitching’s secret sauce with these two stats, but we can maybe say that the relative importance of Stuff (which contributed 55 percent to the model) was larger than the relative importance of Command+ (45 percent). And the point here was to check the relative relationship between the two inputs in a visually compelling way.
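For readers wondering what a "55 percent contribution" can mean mechanically, here is one way such shares can be produced. To be clear, this is a swapped-in technique (permutation importance on a toy model), not necessarily how the Random Forest importances above were computed, and every name and number in it is hypothetical. The idea: shuffle one input at a time, see how much the error grows, and express each input's growth as a share of the total.

```python
import random

def mse(model, rows, targets):
    """Mean squared error of `model` over the rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_shares(model, rows, targets, n_features, rng):
    """Shuffle one input column at a time, record how much the error grows,
    and return each column's share of the total growth."""
    base = mse(model, rows, targets)
    increases = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        increases.append(max(mse(model, shuffled, targets) - base, 0.0))
    total = sum(increases)
    return [inc / total for inc in increases]

rng = random.Random(0)
# Synthetic (Stuff, Command+) pairs centered on 100; not real pitcher data.
rows = [(rng.gauss(100, 10), rng.gauss(100, 10)) for _ in range(500)]

def model(row):
    """Hypothetical stand-in for a fitted model: a lower 'ERA' for better
    inputs, leaning more on Stuff than on Command+."""
    stuff, command = row
    return 4.5 - 0.03 * (stuff - 100) - 0.02 * (command - 100)

targets = [model(r) for r in rows]  # noise-free, so the baseline error is 0
stuff_share, command_share = permutation_shares(model, rows, targets, 2, rng)
print(round(stuff_share, 2), round(command_share, 2))
```

Because the toy model leans harder on Stuff, scrambling Stuff hurts it more, and Stuff ends up with the larger share. The shares always sum to 100 percent, which is why these numbers describe relative importance only and say nothing about how good the model is overall.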

In other words, stuff is more important than command when it comes to outcomes, and that didn’t surprise one MLB pitcher, not one bit.

“That’s what all the numbers say,” the Reds’ Trevor Bauer told The Athletic’s C. Trent Rosecrans. “If your stuff is elite, it doesn’t matter if you hit the center of the plate, it’s hard to hit.”

The pitcher even came with multiple examples.

“If you take someone with league-average stuff and you improve command, he’s going to get better,” he said. “If you improve his stuff, he’s going to get a lot better. You take someone like Shane Bieber who has really good command and he was good his first year out when he was 91-92 [mph], throwing two-seamers and stuff like that. He started throwing four-seamers and throwing 93-95 and then he was an All-Star.”

“I would not consider Gerrit Cole to have great command,” Bauer continued — and indeed, Cole has a 102 Command+, making him more above-average than elite in that respect. “He’s not a Kyle Hendricks type pitcher who never misses his spot. But his stuff is so elite that he misses a ton of bats. He’s a fantastic pitcher.”

Many fans are probably familiar with some of the analysis that Bauer is talking about, perhaps something as simple as the table below, which shows some of the power that stuff (here represented by simple velocity) can have on outcomes (here represented by swinging strike rates on four-seam fastballs since 2017). I’ve referred to myself as a “stuff-ist” because of facts just like this.

Velocity (mph)	Swinging Strike%
90-91	7.4%
91-92	8.1%
92-93	8.8%
93-94	9.5%
94-95	10.5%
95-96	11.7%
96-97	12.7%
97-98	13.5%
98-99	14.4%
99-100	14.9%
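To put one number on that table, here is a quick sketch that fits a straight line through it. The only assumption is that each one-mph band is represented by its midpoint (90.5, 91.5, and so on); the rates are copied from the table.

```python
# (velocity-band midpoint in mph, swinging-strike rate in percent)
bands = [
    (90.5, 7.4), (91.5, 8.1), (92.5, 8.8), (93.5, 9.5), (94.5, 10.5),
    (95.5, 11.7), (96.5, 12.7), (97.5, 13.5), (98.5, 14.4), (99.5, 14.9),
]

n = len(bands)
mean_x = sum(x for x, _ in bands) / n
mean_y = sum(y for _, y in bands) / n

# Ordinary least-squares slope: covariance of x and y over variance of x.
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in bands)
    / sum((x - mean_x) ** 2 for x, _ in bands)
)
print(round(slope, 2))  # 0.88
```

In other words, each extra tick of four-seam velocity in this range is worth a little under one percentage point of swinging-strike rate, and the rate roughly doubles from the 90-91 band to the 99-100 band.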

But command is worth more than a shrug and some hope — command is important. If you change the thing you’re trying to predict, command actually steps forward and becomes more important than stuff.

Check out, for example, what happens when we tried to use Command+ and Stuff to predict innings pitched per appearance using Random Forests — in other words, to predict if a pitcher was a starter or a reliever.

Now you’re looking for the red parts of the graph, and once again there’s an ideal corner of the graph (the top right red band) where pitchers have both great command and stuff. But the effect is reversed: There are hardly any starters with below-average Command+, and there are many starters with below-average Stuff. The predictive effect was stronger than it was when we tried to predict ERA (0.084 r-squared), and the importance of Command+ was greater (it contributed 57 percent to the model) than that of Stuff.

Particularly important is the notable lack of starting pitchers with Command+ scores below 90. Granted, it’s for fantasy baseball purposes, but in my top 75 starting pitchers for the coming season, only two established pitchers had a sub-90 Command+: Dinelson Lamet (89.2) and Garrett Richards (89.4). At the back end, Dylan Cease (89.8) and Josh James (87.5) threaten to join the group, and at the top end, Tyler Glasnow (90.3 Command+) provides stuff-based hope for the entire group. But that’s it. That’s the list.

When asked which was more important, Phillies pitcher Zack Wheeler thought it was command because of the quality of the hitters at this level.

“I want to say stuff, but I guess command because you need to throw strikes or that’s it,” he said in camp last week. “Up here, these guys have such good eyes that you have to make it look like a strike no matter what it is. You can have the nastiest curveball but if it doesn’t look like a strike at some point, they’re not going to budge at it.”

His teammate agreed that command was crucial to turning the lineup over.

“But pitching comes down to being efficient and consistent,” Vince Velasquez said. “You know they have those little nets where they have three or four holes? If you’re a little kid and you can perfect and learn how to go in each corner and every pocket, if you could do that every time, that’s consistency, then you’re in the category with legends with Tom Glavine, and Greg Maddux: spotters.”

So far we’ve looked at overall Command+ and Stuff, but each of those stats is calculated by pitch type — in other words, a pitcher has a particular fastball Command+, and slider Command+, and so on. So there’s more to learn about the interaction between command and stuff and pitcher results. Check out slider command and stuff run against overall ERA.

Command is a lot more important for sliders than stuff, it looks like — command contributed 63 percent to the outcome. Just follow the line for 120 slider Stuff on the graph above — you have to have nearly average command for it to start changing your fortune when it comes to overall ERA.

This may not be all that surprising to some, and it’s a thing I’ve heard from team analysts before. It’s become more important because sliders were thrown more often last year than ever before, but also because the way sliders are being used is changing. Take a look at how often the slider was used in 2-0 counts by year.

The premium is on getting a strike in two-oh counts, but also not getting a ball. The solution, increasingly, is throwing a slider in a good part of the strike zone for the pitcher. So it follows that slider command would be so important in today’s game.

If you want to see the benefits of extreme slider command, look no further than New York, where it helped an ace declare himself (Jacob deGrom) and a veteran rediscover his value (Masahiro Tanaka). In general, you may see the benefit slider command can have on longevity when you look at last year’s slider Command+ leaders (minimum 500 thrown). Teams signing 30-year-old starters to long-term contracts may want to take notice.

Pitchers	Team	Slider Command+
Masahiro Tanaka	Yankees	139
Jordan Zimmermann	Tigers	131
Zach Eflin	Phillies	126
Jacob deGrom	Mets	121
Miles Mikolas	Cardinals	120
Tanner Roark	Reds	118
Zack Wheeler	Mets	118
Kyle Gibson	Twins	117
Kenta Maeda	Dodgers	116
Justin Verlander	Astros	116
Zack Greinke	Astros	116
Sergio Romo	Twins	116
Jack Flaherty	Cardinals	113
Clayton Kershaw	Dodgers	113
Michael Pineda	Twins	113

DeGrom’s command actually rubbed off on his teammates last year. One in particular — Wheeler — saw something in his routine that he liked, and copied it.

“Watching Jake deGrom play,” Wheeler said of what improved his command. “People ask about my command, and it has gotten better. Watching him, he’ll throw twice between starts and he’ll just start with ten glove-side fastballs on the first day, nice and easy. You see him pitch, his command is really good, so I thought, I’m going to do that. But I don’t throw two pens, that’s just how I start off my bullpen now. I did that for a full season and I think that really helped.”

That’s not necessarily slider command in particular, but it does provide an easy segue into our more surprising finding about command and stuff: For fastballs, command was also more important than stuff for predicting overall ERA.

For some, this is perhaps the most unsurprising finding in the article. But the lust after fastball velocity, the radar-gun fetishism, the simple table above and even trends in the first round of the amateur draft suggest that we’re chasing stuff on the fastball level. And yet there’s very little blue on the wrong side of 100 Command+, suggesting that fastball command is just as fundamental a skill as our old baseball truisms would have suggested. Command contributed 61 percent to the predictive model, at least.

There might be a link between our chasing after fastball stuff and the relative importance of fastball command, though. What if we’ve chased fastball velocity so far that we’ve made fastball velocity less unique? If that’s the case, then the way to separate your fastball from the pack is now command instead of more velocity and stuff.

This is more of a theory than fact, but here’s a little evidence.

Look at the shape of the two graphs here, and you’ll see how fastball velocity is becoming more homogenized: no bar in 2008 is as high as the three highest bars in 2019. As that shape develops a higher peak, and as we run up against peak velocities given current bodies and mechanics, it becomes more likely that a player can better separate themselves from the average fastball by command than by velocity.

The pitch that was most stuff-dependent in our analysis was the changeup (64 percent of the importance came from the stuff metric). Stuff, for a changeup, is mostly built on movement and velocity difference off of the fastball, so it’s all intertwined. And, ironically, a changeup artist who has both stuff and command on the changeup can tell us a little something about the pitch — the Blue Jays’ Chase Anderson.

“I can throw that changeup behind in the count, 1-0, 2-0, drop one in there for a strike,” said Anderson, who shares that ability and also elite changeup Command+ (111) with the Reds’ Luis Castillo, among others. “You want more weapons. As you get on the mound as a starter, you feel a lot more confident if you get up there with four pitches. Spring training for me this year is all about having all four pitches for strikes, and for chases.”

Anderson and his comments bring forth a couple of things: Changeup command is rare, and command and stuff are maybe not as independent from each other as we’d hoped at the beginning. Here’s Matt Boyd on slider command, to further the theme.

“I throw working lines,” he said of the skill. “My slider may end up middle down, but I want to work that away line. Working a fastball away, I want to throw a slider off of that. If I miss in, then that pitch didn’t work off the pitch before. Command is all predicated off that pitch before, as well as working the hitters’ weaknesses and strengths.”

Listen to slider command leader Tanner Roark on the value of that skill, and maybe the point here becomes clearer.

“If you leave the slider up just a little bit, and you don’t have that same bite you’re used to, they either foul it off or they crush it,” he pointed out. “The worst is when I have a pitch that I know I can strike them out with, and I lack the execution on it and he spoils it. That was your strikeout pitch, and you should have struck him out with it. If you can place it, if they spoil it, you can go a little further out and get the whiff.”

Roark is talking about execution of the slider here. Execution is replicating the intended movement and velocity and location. Replicating good movement and velocity is captured in Stuff, but it’s also captured in Command+.

Are stuff and command impossible to separate? Even if not, there’s a complicated tension between the two, as you can see from these pitchers’ comments. Every pitch plays better as you get ahead in the count, so even though we removed count effects from our Stuff metric, there’s still an intertwining that happens with these skills.

Also, although research on stuff metrics is now more than ten years old, slider stuff research has lagged behind, at least in the public sector. And even there my most recent attempt to understand sliders found that you really define each pitch in relation to a pitcher’s other pitches, and maybe even the pitch that came just before. As much as we try to separate these things to study them better, we find that each pitcher is really a unique combination of attributes, and it’s hard to split them into atoms.

There’s always more work we can do — we can break these relationships down further to see if command produces weak contact and stuff produces whiffs, for example. We might find that, in order to be a good starter, there’s a minimum number of commandable pitches a pitcher needs to have, and a minimum number of above-average stuff pitches. There’s always more research to be done.

But the fun part is learning, even if you learn you were wrong. Stuff is important! It’s really important. But command also helps explain why your favorite starting pitcher is a starter at all.

(Top photo of deGrom: William Purnell / USA Today)