This weekend, I have been writing about challenges with the data that drives the accuracy of tennis performance rating systems. One of the most significant—and complicated—problems is that player behavior itself can corrupt the information the statistical algorithms rely on. In yesterday’s post, I focused on how ratings management is currently playing out in Junior tennis. Today, I’m turning my attention to the Adult side of the game, where that practice is arguably an even bigger problem.
Ratings management in Adult tennis is tightly coupled with the NTRP system, which is the foundation of most adult recreational play in the United States. Under NTRP ratings, players are grouped into discrete, arbitrary tiers. While level-based play is fundamentally good for the sport since it generally creates more competitive and, thus, enjoyable matches, it also carries significant downsides. Drawing hard lines between groups of players creates powerful incentives for ratings manipulation. Many Adults actively prefer to stay within a desired division where they can dominate rather than stepping up to the next level where they will struggle. That feeling is reinforced and amplified by the structure of USTA League play, which culminates in “National Championships” awarded at each NTRP division. Rather than motivating players to be the best they can be, level-based play coupled with National Championships incentivizes many people to actively manage their rating so they will win most of the time.
The phrase “win most of the time” was intentionally selected. A player who wins all of the time at one level will inevitably be promoted to the next. Consequently, unlike the performance management practices described for Junior players yesterday, ducking competition in Adult tennis is exceedingly rare. (It is 3D-chess-level sophisticated when it does happen, however.)
The most common method for ratings management in USTA League tennis is to give less than full effort in a match. This is frequently referred to as “tanking,” though I am starting to believe that we should move away from that term because many seem to equate that word with giving zero effort or losing badly. I have gravitated toward words like “curate” and “modulate” to describe what these players are doing. This can look like losing at least a couple of games in each set against very weak competition. It is widely believed that split set matches are treated as an even match by the NTRP algorithm, regardless of the score in each set. Consequently, another method rumored to be effective is to drop the first set and then give full effort from there.
A player in my very near orbit once played a match where his partner told him that they were going to win with a score of three and three. His partner, apparently responding to his dumbfounded expression, continued by telling him not to worry because the other three players on the court would take care of everything. The match was played to precisely the score predicted. When that fourth player later asked me what I would have done in that situation, I told him I’d like to believe I would have refused to play the match. In his case, the fourth player didn’t feel he had enough social capital to do that without risking tennis ostracism. I can certainly understand that.
The USTA has implemented some compensation for ratings management within the NTRP algorithm by assigning some matches more weight than others. Currently, whatever nominal value a USTA League match carries, local playoffs count more than that. Additionally, Sectionals are weighted more heavily than playoffs, and matches played at Nationals matter most of all. The logic behind that progression is a belief that the further the competition progresses towards a National Championship, the more representative matches will be of a player’s true performance level. That works only if players don’t modulate their play during playoffs, Sectionals, or Nationals. It’s a flawed assumption.
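To illustrate why stage weighting cuts both ways, here is a minimal sketch in Python. The NTRP algorithm is not public, so everything in it is an assumption made for illustration: the `STAGE_WEIGHTS` values, the per-match rating numbers, and the idea that a rating is a simple weighted average are all hypothetical. The only thing taken from the description above is the ordering of the stages.

```python
# Hypothetical stage weights. Only the ordering (regular season < playoffs
# < Sectionals < Nationals) comes from the post; the numbers are made up.
STAGE_WEIGHTS = {
    "regular": 1.0,
    "playoffs": 1.5,
    "sectionals": 2.0,
    "nationals": 3.0,
}


def weighted_rating(results):
    """Weighted average of per-match ratings.

    `results` is a list of (stage, match_rating) pairs. This is an
    illustrative stand-in, not the actual NTRP calculation.
    """
    total_weight = sum(STAGE_WEIGHTS[stage] for stage, _ in results)
    weighted_sum = sum(STAGE_WEIGHTS[stage] * rating for stage, rating in results)
    return weighted_sum / total_weight


# Same player, same regular season. The only difference is the effort
# given in a single match at Nationals.
honest = [("regular", 3.60), ("regular", 3.60), ("nationals", 3.60)]
modulated = [("regular", 3.60), ("regular", 3.60), ("nationals", 3.30)]

print(round(weighted_rating(honest), 2))     # 3.6
print(round(weighted_rating(modulated), 2))  # 3.42
```

Under these made-up numbers, one modulated match at Nationals pulls the result down to 3.42, where an unweighted average of the same three matches would be 3.50. The heavier the weight, the more a single curated result is worth to a player who wants to stay put.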
When the NTRP algorithm was first conceived, results were collected on paper and then later batch-fed into the NTRP computer for processing. Given the workflow and the technology at the time, treating all matches equally at each tier of post-season play was the only realistic option. However, we now live in a world where capturing real-time results is possible.
A team that has been mathematically eliminated from their local playoffs may recognize the opportunity in the final rounds to lose some heavily weighted matches to hedge against potential promotions. Similarly, even in the semi-finals of the knockout rounds at Sectionals and Nationals, once a match is decided, players on both teams are incentivized to lower their effort to preserve energy or… also for ratings management. Non-advancing fall leagues are the devil’s playground for players intent on modulating their performance for ratings purposes.
Rather than weighting matches based on the competitive stage, it is now technically possible to weight matches on how much each one… actually matters. Consequently, matches for players on a team who absolutely must win their last match of the season to make the playoffs should count more than one played between two teams that have both been mathematically eliminated. A match at Nationals where both teams are winless after the first day shouldn’t count as much as one where one of the teams is still in contention. Similarly, during the knockout stages of playoffs, Sectionals, or Nationals, the first matches to come off should matter more than the ones that end after advancement has been determined.
Non-advancing Fall leagues played “for fun” shouldn’t count for much at all. I adamantly believe those matches should be included for ratings purposes, especially for those players who struggle to get enough matches in to achieve or maintain Computer NTRP ratings. On the other hand, those matches should be weighted less than matches played in advancing leagues. If participation declines when players suddenly become aware that non-advancing matches are weighted lower than Spring USTA League matches that advance to post-season play, that would be an appropriate time to ask them what “for fun” actually means. It shouldn’t be a euphemism for “ratings management opportunity.”
Matches should be required to be entered as they are completed for regular season league play. Once a team match is decided, any subsequent matches should count less. Among the many interesting artifacts that have increasingly started hitting my inbox, I have a screenshot from a captain informing a player that they have already won the team weekly matchup and advising them to “lose at least a set” in their makeup match that is still yet to come. Even at Nationals, teams eliminated from contention on the first day frequently treat their matches on Saturday as a vacation. Those matches should not be weighted as much as those played between teams that are still in contention.
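The stakes-based weighting described over the last few paragraphs can be sketched in a few lines of Python. This is only one possible shape for the idea, not a proposal for specific values: the `MatchContext` fields, the `stakes_weight` function, and the three discount factors are hypothetical names and numbers introduced here for illustration. It also assumes results are entered as matches finish, so the system knows whether the team outcome was already settled.

```python
from dataclasses import dataclass

# Hypothetical discount factors, chosen only to make the example readable.
NON_ADVANCING_FACTOR = 0.5  # league does not feed post-season play
ELIMINATED_FACTOR = 0.5     # neither team can still advance
DECIDED_FACTOR = 0.5        # team result was settled before this match ended


@dataclass
class MatchContext:
    advancing_league: bool      # does the league advance to post-season play?
    team_a_in_contention: bool  # can team A still advance?
    team_b_in_contention: bool  # can team B still advance?
    team_match_decided: bool    # was the team outcome already determined?


def stakes_weight(ctx: MatchContext) -> float:
    """Return a multiplier reflecting how much the match actually matters."""
    weight = 1.0
    if not ctx.advancing_league:
        # "For fun" leagues still count, just less.
        weight *= NON_ADVANCING_FACTOR
    elif not (ctx.team_a_in_contention or ctx.team_b_in_contention):
        # Both teams are mathematically eliminated.
        weight *= ELIMINATED_FACTOR
    if ctx.team_match_decided:
        # Individual match finished after the team result was settled.
        weight *= DECIDED_FACTOR
    return weight


must_win = MatchContext(True, True, True, False)
both_eliminated = MatchContext(True, False, False, False)
vacation_saturday = MatchContext(True, False, False, True)
fall_league = MatchContext(False, False, False, False)

print(stakes_weight(must_win))           # 1.0
print(stakes_weight(both_eliminated))    # 0.5
print(stakes_weight(vacation_saturday))  # 0.25
print(stakes_weight(fall_league))        # 0.5
```

A multiplier like this could sit alongside or replace the stage-based weight. The point is that every input is something a real-time scoring system already knows at the moment the match is recorded.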
Tennis is a test of skill… and character. The competitive framework in the game should reward those who are trying to be the best players they can be. Unfortunately, when the system incentivizes players to manage outcomes rather than pursue excellence, it chips away at the integrity of that competition. If the ratings algorithms used in tennis are meant to measure performance mastery, they must separate consideration of matches that do not matter from those that do. Technology now enables us to perform that differentiation.
The real question is not if the competitive tennis ecosystem can change how ratings are calculated but when it will. The longer this quiet manipulation is allowed to persist, the more we teach players that winning isn’t about playing better—it’s about playing the system. When that is the lesson, tennis loses.