Baseball Faceoff
A deeper look at fantasy baseball strategy
Most Recent Article:
Please send comments or questions to baseballfaceoff@gmail.com

Discuss the columns or other topics in our forums at http://Forums.baseballfaceoff.com

Faceoff
The Power of Consistency

July 2, 2008
Part Three of Three

Chris,
I love the first parts of your analysis, and because I’m a total engineering nerd, I’m going to continue this statistics festival, but before I do, I want to say that before I read anything you did, I thought I’d be better off with consistent players. We’ll see if this holds true with my analysis.
 
First of all, I’m going to use a random number generator to determine a category to analyze in an 18-team 6x6 H2H baseball league. The randomly generated category turned out to be RBI. This year (through 10 weeks), the lowest RBI total by any team is 196. The highest is 339. The best record by any team in this category is 9-1-0. The worst record is 1-8-1. Surprisingly, the team with the lowest RBI total does not have the worst weekly record, nor does the team with the highest RBI total have the best weekly record. Here is a breakdown of RBI (through 10 weeks) by team and corresponding record.

Team        W   L   T   Total RBI
Red Sox     9   1   0   315
Giants      7   3   0   299
Blue Jays   7   3   0   298
Yankees     7   3   0   339
Cardinals   6   4   0   308
Rays        6   4   0   249
Royals      5   5   0   261
Braves      5   5   0   264
Orioles     5   5   0   278
Mets        5   5   0   279
Tigers      4   5   1   252
Angels      4   5   1   256
Brewers     4   6   0   236
Rangers     4   6   0   268
Athletics   4   6   0   216
Twins       3   7   0   196
Marlins     2   7   1   242
Phillies    1   8   1   205

What I’m curious about, and what remains unanswered, is how a team that averages 33.9 RBI per week (the Yankees in the table above) compares to its opponents for the year (who have averaged 26.8 RBI per week against it). Here are the weekly results for the Yankees:

Week   Yankees   Opponent   Outcome
1      26        35         L
2      40        20         W
3      48        25         W
4      28        35         L
5      45        37         W
6      27        8          W
7      27        43         L
8      40        26         W
9      27        25         W
10     31        14         W

Note that the Yankees have the highest RBI total this year to date, but they have lost two more games than the leader in the category, the Red Sox, who are 9-1-0 so far. You can see that on the three occasions that the Yankees lost, they had fewer than their average RBI in that week. There, that proves it: it is better to get consistent performance out of your players than streakiness.
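The tally above is easy to check with a few lines of Python. This is only a sketch built from the ten weekly totals in the table; the variable names are illustrative and not part of the original analysis.

```python
# Weekly RBI totals copied from the table above (weeks 1-10).
yankees = [26, 40, 48, 28, 45, 27, 27, 40, 27, 31]
opponents = [35, 20, 25, 35, 37, 8, 43, 26, 25, 14]

weeks = list(zip(range(1, 11), yankees, opponents))

wins = sum(y > o for _, y, o in weeks)
losses = sum(y < o for _, y, o in weeks)
ties = sum(y == o for _, y, o in weeks)

yankee_avg = sum(yankees) / len(yankees)
opponent_avg = sum(opponents) / len(opponents)

# Weeks the Yankees lost, and below-average weeks they won anyway.
losing_weeks = [w for w, y, o in weeks if y < o]
below_avg_wins = [w for w, y, o in weeks if y < yankee_avg and y > o]

print(f"Record: {wins}-{losses}-{ties}")
print(f"Averages: Yankees {yankee_avg:.1f}, opponents {opponent_avg:.1f}")
print("Losing weeks:", losing_weeks)
print("Below-average weeks that were still wins:", below_avg_wins)
```

Running it shows that the three losses (weeks 1, 4, and 7) were all below-average weeks, but so were three of the wins (weeks 6, 9, and 10), which is why the losses alone don't settle the question.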

OK, so I’m actually not ready to conclude that consistency is the winning way yet; I’ll have to do a little more investigating first. I did an analysis of all players with 14 or more ABs in one week and found that the average player in a week would have three RBI in twenty-two at-bats. If we had a team that gave us 220 ABs in one week (representing starters and irregular backups on a nine-position fantasy offense), we can assume it will produce 30 RBI on average. Certainly, the more data we use, the more statistically relevant our results will be, but for now we will compare three different teams: the average Yankee team, the league average team, and the average Yankee opponent. Even though this is not necessarily true, we will assume that each actual team (ours and each opponent) will produce its average RBI in 220 at-bats (we’ll come back to this assumption later).

Note that when we are building these teams, we are still assuming that all players are equally capable of producing an RBI in an at-bat. In reality, we know that someone like Alex Rodriguez is more likely to have opportunities to drive in runs (and is perhaps even more likely to succeed in doing so) than, say, Brad Ausmus. But to make this analysis possible, we have to make some assumptions!

In order to dig deeper into this investigation, I’ve decided that when comparing the three teams, we will look at what happens when these three teams play each other. Before I do that, I will assume that our data will follow a normal distribution (I used http://davidmlane.com/hyperstat/z_table.html to get my probabilities). This means that we are assuming that the most frequent RBI total will be the average for each player on that team, and that the further a total is from that average, the lower the probability that a player actually produces that number of RBI in a week. Because a player cannot have a negative RBI total, you will see that there is a significant possibility that a player will have 0 RBI in a week. Here are the three probability tables for the “average players involved,” using the standard deviation generated over those 745 RBI in 5404 AB (the weekly RBI totals of the players with 14 or more AB).


Probability (%) a Representative Player Will Have a Given Weekly RBI Total:

RBI   Avg Yank   Avg Opponent   Avg Team
0     7.64       13.90          13.4
1     8.40       11.95          11.7
2     12.88      15.80          15.7
3     16.36      17.30          17.3
4     17.21      15.69          15.8
5     15.01      11.80          12.0
6     10.85      7.35           7.6
7     6.49       3.80           3.9
8     3.22       1.62           1.7
9     1.32       0.58           0.6
10    0.45       0.17           0.2
11    0.13       0.04           0.04
12    0.03       0.008          0.01
13    0.006      0.001          0.0015
14+   0.0008     0.0002         0.0003
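For readers who would rather not use a z-table, here is a rough sketch of how probabilities like these can be produced in Python. It makes several assumptions that are not stated in the article: each whole RBI value k is taken to cover the interval from k - 0.5 to k + 0.5, everything below 0.5 is folded into 0 RBI, the per-player mean is the weekly team average divided by nine positions, and the standard deviation is set to 2.3. That last value is a placeholder chosen only because it lands in the same neighborhood as the table; the figure actually used is not given.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def weekly_rbi_probabilities(mean, sd, max_rbi=14):
    """Probability of each whole RBI total 0..max_rbi (last bucket is 'max_rbi+')."""
    probs = []
    for k in range(max_rbi + 1):
        if k == 0:
            # Negative totals are impossible, so the whole lower tail counts as 0 RBI.
            p = normal_cdf(0.5, mean, sd)
        elif k == max_rbi:
            p = 1.0 - normal_cdf(k - 0.5, mean, sd)
        else:
            p = normal_cdf(k + 0.5, mean, sd) - normal_cdf(k - 0.5, mean, sd)
        probs.append(p)
    return probs

ASSUMED_SD = 2.3  # hypothetical; not taken from the article
means = {
    "Avg Yank": 33.9 / 9,      # assumes nine hitters share the weekly total
    "Avg Opponent": 26.8 / 9,
    "Avg Team": 3.0,           # three RBI per 22 at-bats
}

for name, mean in means.items():
    probs = weekly_rbi_probabilities(mean, ASSUMED_SD)
    print(name, [round(100 * p, 2) for p in probs[:6]])
```

Folding the lower tail into the 0 RBI bucket is what makes a zero-RBI week so likely for a player whose average is only about three.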

As you can see, the average Yankee hitter has a better chance of producing four or more RBI in a week and a lower chance of producing three or fewer RBI in a week. What is striking to me is the very low probability of a Yankee player producing 0 RBI in a week: it is almost half that of either of the other teams. In order to get data by which I can compare these teams, I have used a random number generator to randomly assign an RBI total for each position on each team (only Yankees and opponents) for each of the ten weeks. The total RBI for each simulated team was then tallied and compared to the actual and average totals of both teams for that week. Here are the results:

Week   Actual Yankees   Sim Yanks   Avg Yanks   Actual Opponent   Sim Opp.   Avg Opp.
1      26               32          34          35                38         27
2      40               29          34          20                28         27
3      48               29          34          25                39         27
4      28               24          34          35                24         27
5      45               32          34          37                24         27
6      27               38          34          8                 29         27
7      27               22          34          43                17         27
8      40               40          34          26                33         27
9      27               54          34          25                37         27
10     31               32          34          14                31         27
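The simulated columns can be reproduced in spirit with a short script. This is a sketch under assumptions, not the procedure actually used: it draws each of nine positions independently from the per-player probability columns shown earlier (treating "14+" as exactly 14), uses Python's pseudo-random generator, and the function name is made up for illustration. The numbers it produces will differ from the Sim columns above because the draws are random.

```python
import random

RBI_VALUES = list(range(15))  # 0..13, with 14 standing in for "14+"

# Per-player weekly probabilities (in percent) from the table earlier in the article.
YANKEE_WEIGHTS = [7.64, 8.40, 12.88, 16.36, 17.21, 15.01, 10.85, 6.49,
                  3.22, 1.32, 0.45, 0.13, 0.03, 0.006, 0.0008]
OPPONENT_WEIGHTS = [13.90, 11.95, 15.80, 17.30, 15.69, 11.80, 7.35, 3.80,
                    1.62, 0.58, 0.17, 0.04, 0.008, 0.001, 0.0002]

def simulate_team_weeks(weights, weeks=10, positions=9, rng=None):
    """Return one simulated team RBI total per week."""
    rng = rng or random.Random()
    totals = []
    for _ in range(weeks):
        # One weighted draw per lineup position, summed into a team total.
        draws = rng.choices(RBI_VALUES, weights=weights, k=positions)
        totals.append(sum(draws))
    return totals

rng = random.Random(2008)  # fixed seed so the run is repeatable
sim_yankees = simulate_team_weeks(YANKEE_WEIGHTS, rng=rng)
sim_opponent = simulate_team_weeks(OPPONENT_WEIGHTS, rng=rng)
print("Sim Yankees: ", sim_yankees)
print("Sim Opponent:", sim_opponent)
```

Ten weeks is a small sample, so rerunning with different seeds gives a feel for how much the simulated totals bounce around.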

The resulting records for each team vs. the field (the other five totals for that week) would be:

Week   Actual Yanks   Sim Yanks   Avg Yanks   Actual Opp.   Sim Opp.   Avg Opp.
1      0-5-0          2-3-0       3-2-0       4-1-0         5-0-0      1-4-0
2      5-0-0          3-2-0       4-1-0       0-5-0         2-3-0      1-4-0
3      5-0-0          2-3-0       3-2-0       0-5-0         4-1-0      1-4-0
4      3-2-0          0-4-1       4-1-0       5-0-0         0-4-1      2-3-0
5      5-0-0          2-3-0       3-2-0       4-1-0         0-5-0      1-4-0
6      1-3-1          5-0-0       4-1-0       0-5-0         3-2-0      1-3-1
7      2-2-1          1-4-0       4-1-0       5-0-0         0-5-0      2-2-1
8      4-0-1          4-0-1       3-2-0       0-5-0         2-3-0      1-4-0
9      1-3-1          5-0-0       3-2-0       0-5-0         4-1-0      1-3-1
10     2-2-1          4-1-0       5-0-0       0-5-0         2-2-1      1-4-0

This yields an overall record vs. the field of:

Actual Yankees     28-17-5
Sim Yankees        28-20-2
Average Yankees    36-14-0
Actual Opponent    18-32-0
Sim Opponent       22-26-2
Average Opponent   12-35-3

If summarized a bit differently:

Actual Teams    46-49-5
Sim Teams       50-46-4
Average Teams   48-49-3

This doesn’t seem like much of a difference overall, though it would have been nice to have had my average productivity rather than my actual performance. It would have also been nice to see my opponents put up an average performance rather than their actual performance, although I might have caught a break by playing the actual opponent each week rather than the simulation opponent. That being said, I think it’s more important to look at each team’s record versus each other team to get more perspective on the issue.


                   Actual    Sim       Average   Actual     Sim        Average
                   Yankees   Yankees   Yankees   Opponent   Opponent   Opponent
Actual Yankees     --        5-4-1     4-6-0     7-3-0      6-3-1      6-1-3
Sim Yankees        4-5-1     --        3-7-0     6-4-0      7-2-1      8-2-0
Average Yankees    6-4-0     7-3-0     --        6-4-0      7-3-0      10-0-0
Actual Opponent    3-7-0     4-6-0     4-6-0     --         3-7-0      4-6-0
Sim Opponent       3-6-1     2-7-1     3-7-0     7-3-0      --         7-3-0
Average Opponent   1-6-3     2-8-0     0-10-0    6-4-0      3-7-0      --


Read across a row to see that team’s record (over the 10 weeks) against each of the teams named across the top; read down a column to see each opposing team’s record against that team. For example, if I look at the third row and fifth column of records, I find the record of the average Yankee team vs. the simulated opponent.
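Both the head-to-head chart and the records vs. the field can be recomputed directly from the weekly totals. The sketch below does exactly that, comparing totals week by week; the function names are illustrative and not part of the original analysis.

```python
# Weekly RBI totals (weeks 1-10) from the totals table earlier in the article.
TOTALS = {
    "Actual Yankees":   [26, 40, 48, 28, 45, 27, 27, 40, 27, 31],
    "Sim Yankees":      [32, 29, 29, 24, 32, 38, 22, 40, 54, 32],
    "Average Yankees":  [34] * 10,
    "Actual Opponent":  [35, 20, 25, 35, 37, 8, 43, 26, 25, 14],
    "Sim Opponent":     [38, 28, 39, 24, 24, 29, 17, 33, 37, 31],
    "Average Opponent": [27] * 10,
}

def head_to_head(team, other):
    """Return (wins, losses, ties) for `team` against `other` across all weeks."""
    pairs = list(zip(TOTALS[team], TOTALS[other]))
    wins = sum(a > b for a, b in pairs)
    losses = sum(a < b for a, b in pairs)
    return wins, losses, len(pairs) - wins - losses

def record_vs_field(team):
    """Sum the head-to-head records against every other team."""
    wins = losses = ties = 0
    for other in TOTALS:
        if other != team:
            w, l, t = head_to_head(team, other)
            wins, losses, ties = wins + w, losses + l, ties + t
    return wins, losses, ties

for team in TOTALS:
    row = ["--" if other == team else "-".join(map(str, head_to_head(team, other)))
           for other in TOTALS]
    field = "-".join(map(str, record_vs_field(team)))
    print(f"{team:<17}", *(f"{cell:<7}" for cell in row), "| field:", field)
```

Because a team's record vs. the field is just the sum of its row in the chart, the two views are a useful cross-check on each other.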

Does all of this data reveal everything? I’m not quite sure. It sure looks like over the course of a season a team that has poor totals in a category will generally lose that category in a head-to-head matchup to a team with superior totals in that category. I do see that the actual Yankee team has a better record than the simulation or average total would predict. Likewise, the actual opponent has a better record than the simulation or average total would have predicted. So I guess what I’m trying to say is: I’ll take my actual totals every week and not worry about consistency.

I promised a short explanation of why we should not assume that we can just take the average RBI total for our team and our opponents and use them for this analysis. The summary is that every team is different, so technically, I should produce a normal distribution of data for each team, then perform simulations for each team in each week (rather than lumping all of my opponents into one typical opponent). While technically more accurate, this type of analysis would take much more time than I’m willing to invest at this point in time. The data provided convinces me that I need not worry about consistency too much when putting together my team.

Finally, not only do we see that every team is different, but we also find that each team can be very different in a given week. Injuries, days off, and owner changes can influence the players producing those RBI for a team. One thing I can be certain of is that you should maximize your RBI chances by having viable backups for as many positions as possible in leagues that let you make daily substitutions. If you are concerned with your peripheral statistics (batting average, on-base percentage, and/or slugging percentage), then perhaps you can focus on getting backups with good peripherals and take your chances on the accumulated stats (runs, home runs, RBI, stolen bases).

Chris, it took some time, but I think this analysis is going to be helpful when I put together teams in the future. I hope you agree.




Faceoff
The Power of Consistency


June 21, 2008
Part Two of Three

My previous article detailed an experiment investigating the conventional wisdom that favors consistent hitters over streaky sluggers for Head-to-Head leagues. If you missed it, Part One appears further down this page under Previous Articles.

Last time the experiment showed little evidence to confirm the sensible-sounding idea that you want to skip over streaky hitters for your head to head league team. The conditions of the experiment were rather artificial and unrealistic, despite the novel ideas they demonstrated. No player plods along hitting a single dinger each week, just as no one “saves up” all four of his homers to spit out once every four weeks in chunks of four.

Therefore, I thought a different type of experiment was in order. Let’s make it a bit more realistic and base the number of home runs on weekly team output. I decided to define a bunch of different teams – starting with consistent teams that always hit 9 home runs in a week, and adding variation with teams that hit between 8 and 10, between 7 and 11, etc., until we get to a terribly inconsistent team whose output in any week can range from 0 to 18.

Thanks to the magic of random.org, it is possible, even easy, to produce massive lists of random numbers within a range that you define. I made lists of 2,600 random numbers (100 Head to Head league seasons of 26 weeks each) according to each condition, pasted them into a spreadsheet, and then let the program do team by team comparisons for me.

What I ended up with is shown in the table below (I didn't run all the match ups, focusing primarily on teams E and F; a dash marks a match up I didn't run):

Team (HR range)   Avg.    W vs A   W vs B   W vs C   W vs D   W vs E   W vs F   W vs G
Team A (9)        9       0        -        -        -        1213     1219     1228
Team B (8-10)     8.987   -        0        -        -        1215     1217     -
Team C (7-11)     9.01    -        -        0        -        1235     1204     1216
Team D (6-12)     8.999   -        -        -        0        1214     1221     -
Team E (0-18)     8.995   1246     1252     1235     1251     0        1220     1205
Team F (0-18)     9.12    1249     1245     1257     1252     1242     0        1253
Team G (3-15)     8.95    1170     -        1157     -        1240     1207     0




I can’t say I’m shocked with the results. I think this comparison shows that the effect of consistency is either insignificant – or left to random chance of match-ups. I see no real pattern in how teams perform against each other, so it is difficult to deduce any hard truths from the results, except that again, we see a seemingly insignificant or negative correlation with consistency. Below are a few comments on the results:
•    Team F has a very high average (9.12) and performs very well against all comers, but you would expect that given that high average. Less easily explained is Team E’s performance. They did very well against most teams (almost as well as Team F), despite having an average similar to (if not lower than) that of their opponents.
•    Just to be clear – I think that if you simulated enough “weeks” the average would get closer to 9 for every team. It looks like 100 seasons may not have been enough to reach that point. But the similarity between the records of Teams E and F suggests that an inconsistent team may be better over the long haul.
•    Team G performed poorly against teams A and C, but much better against team E. That’s one result that I don’t understand. It seems fluky. G does have the lowest overall average among these teams, so maybe that is the main reason for the wild discrepancy in team G’s performance against teams E and F.
•    Even the biggest difference between our hypothetical teams is 59 “wins,” which is only one-half win per season. Both of those deficits were achieved by the G team – which coincidentally happened to have the lowest average of any of the teams.

I think that this experiment is as interesting as the previous one. Maybe I’m right and the distribution is no big deal. Or maybe my method remains too unrealistic. I might have to look around and figure out how to play around with what statisticians call a normal distribution, also known as a bell curve. That may be the best way to tackle this problem of inconsistency in head to head leagues.

But until then, I think we can say that it isn’t worth making a point of worrying about consistency. Now, Joe, it’s your turn to weigh in on this topic. Am I full of it? Should fantasy owners take consistency into account when building their teams? I still think that anyone who says so has a burden to prove that consistency can be predicted. I’m not so sure it can be. Robinson Cano was pretty good in April of 2006, even though he started really slowly in 2007 and 2008. CC Sabathia has had solid Aprils as well as lousy ones.

In any case – even as a purely academic exercise, I think it’s been interesting. Readers are the true judges, however. Please get in touch with us – email us at baseballfaceoff@gmail.com, chat in the forums. We love it when our readers question us or comment on our articles. And if you have a question about another topic, please ask – we’re collecting good ones for our first-ever mailbag column.

All right, Joe, after some 3,000+ words, I guess it’s your turn.






Previous Articles:

Faceoff
The Power of Consistency

June 9, 2008
Part One of Three

Posted By Chris

This week Joe and I are going to tackle an issue that comes up from time to time in fantasy advice columns: should fantasy owners seek out consistent performers over streaky players when assembling head-to-head league teams? Whenever I’ve heard someone say that – I figure, yeah, that sounds like a pretty good idea. But I never looked at it in detail. It feels like it would be a good idea to grab guys who will produce consistently throughout the year, rather than hitters who alternate slumps and streaks. If your team produces consistent numbers week after week, it will be able to compete every week and you won’t have to worry about weeks when your team struggles to put up a line like 12 runs / 2 HR / 9 RBI / 1 steal / .202 Avg.

Nobody wants to have a team that will lay an egg every few weeks. But is the fear of a rough week a good reason to avoid streaky players? Do you value a player less if he traditionally starts slowly, or if his production drops like a rock every August 1st? Do you believe your team is better off with players that produce consistently, or with guys who are streaky and produce unpredictably throughout the year? These are the questions that should be examined – and will be in this Face Off.
 
For this round Joe and I will discuss how each of us plans for head to head leagues and recommend strategies to help you put together a better team – for now, and when next year’s draft comes around.

My assumption has always been that there are many attributes to weigh when considering players for head to head league teams. Streakiness is not one of them. I always thought it was just too much work to worry about it. My hope was that uneven production would even itself out with good weeks to make up for the bad ones.

For example, you never know if your team will need to get two or seven saves in a given week to win the category. They will happen when they happen. You do know that if you have three closers likely to save 30 games, and most teams in your league have just one or two, that your team should win the category quite often. There is always a chance that your three closer team will only pick up two saves some week, and your opponent’s lone closer will pick up three or four, beating you in a category that you expected to win. This makes head-to-head leagues unpredictable and, for some people, frustrating.

But then again, baseball is unpredictable. The best team doesn’t always win. The best players make outs in more than two-thirds of their attempts. Sluggers become legendary despite failing to hit home runs in more than 90 percent of their at bats. Because positive outcomes are rare, it is relatively common for overmatched teams to pull out victories. I, for one, deeply enjoy this aspect of random chance that comes up in head to head leagues.

With roto leagues it doesn’t matter when your players accumulate their stats. Any homer goes into the homer pile; the team with the biggest pile at the end of the year wins the category. In a head-to-head league, the ideal situation would be to have a team that rations its homers to accumulate one more than your opponent every week, and holds onto extras for the next time you need them, like roll-over minutes.

I think the best advice is: don’t worry about it. Go for players who will accumulate the best stats over the course of the year, regardless of the type of league. I can see an argument for avoiding players who consistently tire down the stretch in a league where there are playoffs in September, but I’m not convinced it’s worth worrying about, primarily because I’m not sure these kinds of trends can be predicted with accuracy.

Before you email to tell me what an idiot I am, think about the assumptions that must be true to justify seeking only consistent players. The first assumption is that any trends you can find in past performance will be repeated this year. This is not trivial – for instance, statisticians have a difficult time proving the existence of clutch hitters, by any metric you want to choose. It may seem obvious that Player X is a great clutch hitter (David Ortiz, Derek Jeter, whoever you like), but your memory might emphasize successes and forget the failures. Secondly, you have to prove that consistency is actually a better outcome than feast or famine. That’s what I want to talk about in the second part of this article. I did a little experiment and the results are downright shocking.


EXPERIMENT 1: Examining the Concept

I performed some calculations to measure what really happens when you have streaky home run hitters on your team. First, the assumptions of this experiment: because there are 26 weeks in a baseball season, I based my calculations on players who hit an average of 26 homers a year, or one per week. I broke this down into two types of players: those who hit one homer every week, and those who hit four home runs in a week, but just once every four weeks on average. This works out to be basically the same number of home runs over the course of a season. Obviously, neither of these hypothetical players is based on someone real - no one is this consistent or this streaky - but I thought the results would be enlightening.

The “control group” is a set of three consistent players who each hit one home run every week, for a total of three homers each week. To me the question is – does a set of three inconsistent players hit more than three home runs more weeks than not? Because if they do, that would suggest that there’s no need to worry about the kind of consistency some people seek out.

The first thing I did was to run the probabilities for a set of three players, all the type of player who hits four homers in one week, at a rate of once every four weeks. The likelihood that a given player would hit his four home runs is one in four. The likelihood that he wouldn’t hit any is three in four. The likelihood that all three would hit zero home runs in a given week is ¾ times ¾ times ¾, equal to 27/64, or 42.2 percent.

The chances that Player A would hit his four when none of the others do is ¼ times ¾ times ¾ for a total of 9/64, or about 14 percent. The same rate holds for Player B or Player C smashing his four bombs while his teammates flounder. That means it works out that there’s also a 42 percent chance (3 times 9/64, or 27/64) that this “team” would hit exactly four home runs in a week.

The chances that Players A and B would hit their four on the same week that C hits none is ¼ times ¼ times ¾, or 3/64, 4.7 percent. The same rate is true if the pair that hits homers are A and C or B and C. The chances that all three hit them at the same time is 1 in 64, or very low – 1.6 percent.

To review, 42 percent of the time this mini-team will hit zero home runs. But 58 percent of the time they will hit four or more. About 16 percent of the time (once every six weeks!) they’ll hit eight or more.
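The three-player arithmetic can be double-checked by listing all eight hot/cold combinations and adding up their probabilities. This is a small sketch using exact fractions; it assumes, as the experiment does, that each player's hot week is independent of the others, and the function name is illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

P_HOT = Fraction(1, 4)  # chance a streaky player hits his four homers this week

def team_hr_distribution(players=3, hr_when_hot=4):
    """Map each possible team home run total to its exact probability."""
    dist = defaultdict(Fraction)
    # Each outcome is a tuple like (1, 0, 0): 1 = hot week, 0 = no homers.
    for outcome in product([0, 1], repeat=players):
        prob = Fraction(1)
        for hot in outcome:
            prob *= P_HOT if hot else 1 - P_HOT
        dist[sum(outcome) * hr_when_hot] += prob
    return dict(dist)

dist = team_hr_distribution()
for homers in sorted(dist):
    print(f"{homers:>2} HR: {dist[homers]} ({float(dist[homers]):.1%})")

beats_three = sum(p for homers, p in dist.items() if homers > 3)
print(f"Beats a steady 3 HR team: {beats_three} ({float(beats_three):.1%})")
```

Using fractions rather than decimals keeps the results exact, so the 27/64, 9/64, and 1/64 figures come out without any rounding.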

Remember, the team you are trying to defeat always hits three homers a week. This means that more often than not, your team of streaky players should beat the team of consistent players. I’ll have to admit that this surprised me quite a bit. And it prompted me to do even more math. I ran the same test for a team of 9 players.


EXPERIMENT 2: 9 Man Math Attack

I used the same two types of players to run what turns out to be a pretty complicated calculation when you do it for nine players. Again we are looking at two teams, one freakishly consistent, putting up nine home runs a week, and the other made up of nine different guys who hit four home runs in a week, but just once a month. Rather than narrate all my calculations for this one, I’ll present the results in a table (email me if you want to know my method). Basically, the outer two columns are the ones that matter:

Home Runs   Zero-HR players, Four-HR players   # of Possible Combos   Total Percentage Chance
0           9,0                                1                      7.51%
4           8,1                                9                      22.53%
8           7,2                                36                     30.03%
12          6,3                                84                     23.36%
16          5,4                                126                    11.68%
20          4,5                                126                    3.89%
24          3,6                                84                     0.87%
28          2,7                                36                     0.12%
32          1,8                                9                      0.01%
36          0,9                                1                      Almost Zero
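One way to arrive at a table like this is the binomial formula: the number of ways to choose which players get hot, times the chance of that exact combination. The sketch below assumes that is the calculation behind the table (the method is not spelled out in the article), treats each player's week as independent, and uses made-up function names.

```python
from math import comb

def streaky_team_distribution(players=9, p_hot=0.25, hr_when_hot=4):
    """Map each possible team home run total to its probability."""
    dist = {}
    for hot in range(players + 1):
        combos = comb(players, hot)  # ways to pick which players are hot
        prob = combos * p_hot**hot * (1 - p_hot)**(players - hot)
        dist[hot * hr_when_hot] = prob
    return dist

dist = streaky_team_distribution()
for homers, prob in dist.items():
    print(f"{homers:>2} HR: {prob:.2%}")

# Compare against a steady team that hits exactly 9 homers every week.
p_lose = sum(p for homers, p in dist.items() if homers < 9)
p_eight_plus = sum(p for homers, p in dist.items() if homers >= 8)
print(f"Loses to a steady 9: {p_lose:.1%}")
print(f"Hits eight or more:  {p_eight_plus:.1%}")
```

Since the streaky team's totals are always multiples of four, it can never tie a team that hits exactly nine, so every week that is not a loss is a win.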

Hmm. This is less clear-cut than the three-player example. The pure numbers show that three times out of five (60 percent), the streaky team will lose the home runs category to a team that hits nine homers a week. However – it also shows that the streaky team will hit eight or more home runs an astonishing seventy percent of the time. Essentially you are right there with the consistent team more than two-thirds of the time. My interpretation is that it looks to be worth taking your chances with the streaky team.


What Have We Learned?

The idea that streakiness may be worth seeking out is not what I was expecting. It always seemed intuitive and even obvious that consistency was something you would want. Part of what these experiments demonstrate is the randomness of head to head leagues – it shows why a team that hits 150 home runs over a season sometimes beats a team that hits 250. I think this needs further examination.

There are other things I can do to examine this idea of consistency, and a lot of other categories to work with – and I will over time. Before Joe gets a chance to respond, I’m going to perform a different experiment on how home runs are distributed and see how that turns out.

With the type of streaky team I have been discussing, you might be destined to suffer many weeks of frustrating futility, but the calculations suggest that you could be better off in the long run. It’s a counterintuitive, somewhat bizarre result. If you can offer some criticism explaining faults in my reasoning I’d be open to hearing about it, but the numbers look pretty convincing. So please email me at baseballfaceoff@gmail.com.

And for Joe, as you start putting together your response - what are your views on the value of consistency in head to head leagues? Do you have anything to say about the findings of this experiment?

Sometimes questioning the conventional wisdom will pay off. Going back to basics and examining the justification for basic assumptions can help you find value where other players will miss it.











