I’m with Ag and Ed here. Put in a call to Tim Harikkala, for goodness’ sake. There has to be somebody out there with an arm still attached who can do better than this bullpen bunch. I’ll show shortly how bad our guys are, and that two guys we got rid of are a lot better than what we have.
First, the good news – Jon Gray had a great outing, even given a 70-ish pitch limit and a shaky 1st inning. Tonight’s 5 innings, 1 run, 4 hits, 5 Ks, no BB, 69 pitches, 50 strikes, 95.5 avg fastball is what a line should look like, length truncated. Walt said under no game conditions are they letting him go any farther in pitch count for the remainder of his outings this year.
Our starter average is 5.1 IP, anyway, so it’s not upsetting the mix we’re about to analyze.
The Rockies bullpen phone rings at least 4 times per game. The pen has faced more batters and given up more runs, more hits, and more walks than any in baseball, and that translates directly to a terrible WHIP. Yes, some of that is on the short starts night after night, as we’ll see shortly. BABIP is also .001 from the worst in the league. We could look at SIERA, but it’s not the worst in the league as one would expect, so it’s not telling the whole story. RE24 also isn’t good, but the Rockies relievers are in the middle of the pack – again, something is missing.
We know these guys as a group are horrific, and I’m turning to basic probability theory to explain why.
By far the worst thing I see is what I would call “reliability”. The average of runs/appearance is too much peanut butter – some outings are clean, some are really bad. What I want is the probability of any runs in a given appearance. I don’t really care how they score or how many score (there are other metrics for that), only whether we prevent them from scoring on a per outing basis (not per plate appearance like RE24). To the game logs, where I can count outings for the short guys. I’m skipping Bergman and Flande; long guys are going to give up a run for sure.
Total appearances, followed by appearances with 1 or more runs yielded, percent:
LaTroy Hawkins: 24, 5, 21%
Boone Logan: 48, 11, 23% (but not as good as it looks considering only 30.1 IP)
Justin Miller: 17, 4, 24%
Tommy Kahnle: 32, 9, 28%
Scott Oberg: 42, 12, 29%
John Axford: 42, 12, 29%
Rafael Betancourt: 42, 14, 33%
Christian Friedrich: 49, 17, 35% (5 of those are 2 inning appearances)
Gonzalez Germen: 16, 6, 38% (incl Cubs appearances and 1 piggyback start)
Some of those burps are multi-run disasters, some are based on inherited runners instead of self-inflicted, and there is variation in length – it would take a lot more analysis to really dig into this and have a statistically precise discussion. We could also take the glass half full version, saying Kahnle is clean in 72% of outings, but I want the number this way for the conclusion.
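For anyone who wants to check the arithmetic, here is a minimal Python sketch of the per-outing "run rate" behind the list above. The counts are copied from the list; the names `rockies` and `run_rate` are just my labels for illustration, not an established stat.

```python
# (total appearances, appearances with 1 or more runs yielded),
# copied from the list above
rockies = {
    "Hawkins": (24, 5),
    "Logan": (48, 11),
    "Miller": (17, 4),
    "Kahnle": (32, 9),
    "Oberg": (42, 12),
    "Axford": (42, 12),
    "Betancourt": (42, 14),
    "Friedrich": (49, 17),
    "Germen": (16, 6),
}

def run_rate(appearances, scored_in):
    """Fraction of outings in which at least one run scored."""
    return scored_in / appearances

rates = {name: run_rate(*counts) for name, counts in rockies.items()}

# Print most reliable first (lowest rate of outings with a run)
for name, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{name}: {rate:.0%} with a run, {1 - rate:.0%} clean")
```

This treats every outing the same regardless of length or inherited runners, which is exactly the simplification admitted to above.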
I think it’s an interesting line of investigation. I want to put a guy in and have a reasonable chance of a clean outing with no runs, and I’m sure Weiss does, too. In fact, to succeed I need to put in several guys and have them all clean (or close to it, some days we actually have offense and we are ahead or score late). Yeah, I hear the complaints about leaving guys in too long or using the “wrong” guy, but the reality is we don’t have enough reliable guys and bringing in another unreliable guy in a bad situation does little for success.
If 4 relievers pitch, each with a 25% chance (or higher) of giving up a run, we get bullpen runs on most nights. (The 4 outings are not mutually exclusive events, so we can’t just sum the individual probabilities. Assuming guys start innings and the outings are independent, the chance that A or B or C or D fails is 1 minus the chance that all four are clean: 1 minus 0.75 to the 4th power, about 68%.) If starters were going 6 and we used 3 guys, the chance of at least one fail drops to about 58% – and if we had better guys, the odds get a lot better.
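The "A or B or C or D" arithmetic can be sketched in a few lines of Python. This assumes the outings are independent (each guy starts a clean inning and one outing doesn't affect the next), which is a simplification; `p_any_run` is a name I made up for the sketch.

```python
from math import prod

def p_any_run(fail_rates):
    """Chance that at least one outing yields a run, assuming independence.

    Computed as 1 minus the chance that every outing is clean.
    """
    return 1 - prod(1 - p for p in fail_rates)

four = p_any_run([0.25] * 4)   # 4 relievers at 25% each
three = p_any_run([0.25] * 3)  # 3 relievers at 25% each

print(f"4 guys at 25%: {four:.1%} chance of bullpen runs")
print(f"3 guys at 25%: {three:.1%} chance of bullpen runs")
```

Note that simply adding the four 25% chances would give 100%, which overstates it; the complement form is the standard way to combine independent events.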
From this view we can see why Hawkins was still valuable, why they keep going to Kahnle and Oberg in spite of the walks and long balls respectively, and why Miller is getting more chances. We also see why even if those guys are decent, we still get so many fails.
Meltdowns are to be expected, we just want the fewest possible. If we look at the bullpen with the fewest runs allowed – the Cardinals – and do the same analysis, here’s what we get on a quick sample for their most used guys:
Siegrist: 58, 7, 12%
Maness: 57, 12, 21%
Rosenthal: 53, 7, 13%
Belisle: 30, 7, 23% (hmmm, and we said we’d never miss him)
Villanueva: 27, 8, 30%
They get 6.1 IP on average, so we’re looking at 3 plus-side guys in the 7th, 8th, 9th. The numbers say they have 4 guys more reliable than our best guy right now, the top 3 on more appearances, which supports the point. I’ll bet other better bullpens also support the theory.
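To put the two bullpens side by side, here is a rough Python sketch that chains the Cardinals' top 3 against 4 of our current guys, using the counts from the two lists. Which four Rockies get the ball on a given night is my assumption for illustration, and the whole thing again assumes independent outings.

```python
from math import prod

def p_any_run(counts):
    """Chance at least one reliever yields a run, assuming independence.

    counts: list of (appearances, appearances with 1+ runs) pairs.
    """
    return 1 - prod(1 - scored / apps for apps, scored in counts)

# Cardinals' 7th-8th-9th: Siegrist, Maness, Rosenthal
cardinals = [(58, 7), (57, 12), (53, 7)]

# Hypothetical Rockies night covering 4 innings: Logan, Miller, Kahnle, Oberg
rockies = [(48, 11), (17, 4), (32, 9), (42, 12)]

cards_fail = p_any_run(cardinals)
rox_fail = p_any_run(rockies)

print(f"Cardinals, 3 guys: {cards_fail:.0%} chance of bullpen runs")
print(f"Rockies, 4 guys:   {rox_fail:.0%} chance of bullpen runs")
```

On these numbers the Cardinals come out around 40% and the Rockies around 70%, so the gap comes from both better arms and one fewer inning to cover.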
I’ll say it again: this is such a disaster, there is not a lot Weiss can do besides load the gun and hope for the best. He is managing more to protect arms than anything else. I’m not saying he has not made mistakes, just that this is a no-win scenario on most nights – like tonight, 2 clean innings, 2 three-run melts. Yeah, don’t use Boone Logan against right handers, OK, that’s on Weiss. Yeah, Kahnle had help from Hundley on an error at a bad time. But for the most part, A or B or C or D in any order equals fail in the bigger picture.
Some sabermetrician is going to pipe up and say “I have an analysis for this and it’s called [blank].” (-WPA almost captures it. ARP might, but I haven’t seen data.) Please do, I’d love to find automated analysis already done that would prove or disprove this point. If not, I’m joining SABR and this is my thesis, I’ll find a better name for it.