Since Buddahfan keeps trying to convince people of the issues with WP, but no one is really discussing the issues themselves (neither he nor the people he talks to), I thought I'd try to put up a summary of the main criticism aimed at the statistic.
Before I start, you can take a look at this link for the full calculation of WP:
... or not, you can just trust me ;) But I would suggest you keep it open in another tab or something to reference for things like the marginal values assigned to rebounding, etc.
The main strength of WP is that it is built on correlation rather than assumption. Almost every value assigned in the WP formula comes from a regression of some sort: the weights given to rebounds, field goals made, field goals missed, assists, etc. All of this is great, as WP is not meant to be a simple model that just does what we expect with the numbers. Instead it correlates them mathematically and shows us what the numbers say about wins, rather than what we say about wins.
The way it works is that it finds how many points a player scores per possession (offensive efficiency) and how many points he gives up per defensive possession (defensive efficiency), then uses a correlation between wins and the efficiencies to find his WP. Fairly straightforward. Each action is given a marginal value, i.e. the value of that action compared to the expected result of a possession (which is roughly scoring 1 point). The table shown in the link above gives the marginal win-value of each action. These values are based on two things: the possession used in the action taken, and the points value of taking that action, and they result in the basic production value for a player. There are a bunch of other factors added in to get to WP, but they aren't the issue.
The points values of each action are obvious. A 3 pointer made earns you 2 points above the expected 1 point (the exact values are slightly different, but I am simplifying this for discussion). A 2 pointer made earns you 1 extra point. You can see this in the marginal win production values in the table at the link (3FG = +.064; 2FG = +.032).
The possession values are where the major criticism of the model comes. There are 3 possible outcomes for a possession (we'll ignore team turnovers and team rebounds, and ignore the effects of steals, assists, etc to simplify):
1) A turnover. This means your team loses the ball and the other team gains it - entirely because of you. A full possession is used in a TO.
2) A field goal made. This uses the possession. Simple.
3) A missed field goal and a defensive rebound for the other team. Here there is a sharing of the credit for the possession. You've lost "x" possessions by missing the shot, and the defender has gained 1-x possessions by grabbing the board. The value of x could be anything, depending on how you value defensive rebounds compared to offensive rebounds.
The other possibility from a missed shot is that your team gets an offensive rebound. This means that the "x" portion of a possession you lost when you missed has been regained - you don't lose the possession. So an offensive rebound counts as -x possessions used, which is the same as saying it gains x possessions back.
This is where the relative value of an offensive rebound versus a defensive rebound is determined. An offensive rebound nets your team x possessions, while a defensive rebound nets your team 1-x possessions. Again, x can be anything, and you could do anything from just using the ratio between the number of offensive rebounds and defensive rebounds in the league to just picking a value between 0 and 1 to determine it.
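To make the split concrete, here is a minimal Python sketch of the credit each event gets for a given x. The function name and the 0.5 value are just my own picks for illustration, not anything from Berri's calculation.

```python
def possession_credit(x):
    """Possession credit per event for a given split x (hypothetical helper)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be between 0 and 1")
    return {
        "missed_shot": -x,             # shooter's team loses x of a possession
        "offensive_rebound": x,        # an offensive board wins that x back
        "defensive_rebound": 1.0 - x,  # the defender's team gains the remainder
    }

credit = possession_credit(0.5)  # 0.5 is an arbitrary illustrative value

# Miss followed by an offensive rebound: the offense nets out to zero.
net_offense = credit["missed_shot"] + credit["offensive_rebound"]

# Miss followed by a defensive rebound: the shooter's loss plus the
# rebounder's gain always add up to exactly one possession changing hands.
total_change = -credit["missed_shot"] + credit["defensive_rebound"]

print(net_offense, total_change)
```

Whatever x you pick, the miss-plus-defensive-rebound case adds up to one possession, which is the consistency the rest of this post is about.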
This is the one portion of WP that is NOT from a regression. The regression done for win production values of each action is based on the possession value of those actions. And the possession value is fairly straightforward when looking at made field goals and turnovers. But that x value is complicated. A case could be made for a variety of values for x, and the exact value used is not a stat-breaking issue. The problem is that the weight WP uses for x is not constant: it changes depending on which formula you look at.
UPDATE: Not So Friendly Stranger pointed out some areas this was confusing. And it seems that confusion was partially due to me writing it down wrong. So yeah. Reread this section please.
The possession formulas are as follows:
Possessions used = FGM + x*FGMS + 0.47*FTA + TO - x*REBO + (1-x)*DREBD + x*DREBTM
Possessions gained = DFGM + x*DFGMS + 0.47*DFTM + (1-x)*REBD - x*DREBO + DTO + x*REBTM
FGM = Field Goals Made
FGMS = Field Goals Missed
FTA = Free Throw Attempts - we'll ignore these
TO = Turnovers
REBO = Offensive Rebounds
REBD = Defensive Rebounds
REBTM = Team Rebound - we'll ignore this for now
DFGM = Opponents' Field Goals Made
DFGMS = Opponents' Field Goals Missed
DFTM = Opponents' Free Throws Made - we'll ignore these
DTO = Opponents' Turnovers
DREBO = Opponents' Offensive Rebounds
DREBD = Opponents' Defensive Rebounds
DREBTM = Opponents' Team Rebound - we'll ignore this for now
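If it helps to see the two general formulas as something you can plug numbers into, here is a rough Python transcription. The box-score totals and the x = 0.5 are made up by me purely for illustration, and team rebounds default to 0 since we're ignoring them.

```python
def possessions_used(x, FGM, FGMS, FTA, TO, REBO, DREBD, DREBTM=0.0):
    """Possessions used, transcribed term by term from the formula above."""
    return (FGM + x * FGMS + 0.47 * FTA + TO
            - x * REBO + (1 - x) * DREBD + x * DREBTM)

def possessions_gained(x, DFGM, DFGMS, DFTM, REBD, DREBO, DTO, REBTM=0.0):
    """Possessions gained, transcribed term by term from the formula above."""
    return (DFGM + x * DFGMS + 0.47 * DFTM + (1 - x) * REBD
            - x * DREBO + DTO + x * REBTM)

# Hypothetical single-game team totals (not real data).
used = possessions_used(x=0.5, FGM=40, FGMS=45, FTA=20, TO=14,
                        REBO=11, DREBD=30)
gained = possessions_gained(x=0.5, DFGM=38, DFGMS=44, DFTM=15,
                            REBD=33, DREBO=10, DTO=13)

print(round(used, 2), round(gained, 2))
```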
The formula is simplified as follows on Berri's site:
Possessions used = FGA + 0.47*FTA + TO - REBO
Possessions gained = DFGM + 0.47*DFTM + REBD + DTO + REBTM
Note that the FGA in the used formula is FGM (Made) and FGMS (Missed) combined, with the x coefficient on FGMS assumed to be 1.
The main issue here is the value of x. In the first equation, rewritten without FTA to simplify:
Possessions used = FGM + FGMS + TO - REBO
The value of each part is the same. So a FGM is worth as much as a FGMS is worth as much as a REBO. This tells us that x must be 1 in this equation.
In the second equation, a defensive rebound is worth as much as a FGM:
Possessions gained = DFGM + REBD + DTO
Here you can see that the opponents' field goals missed has disappeared - its value is 0. This makes sense, since the defensive rebound value (1-x) is 1, so x must be 0.
However, this means that the value of x changes depending on which side of the ball you are looking at. This is illogical. It means that for individual players, a missed field goal is as bad as a turnover - it uses a full possession. But at the same time, a player can gain a full possession by grabbing a defensive rebound.
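Here is a small Python check of that mismatch, as a sketch only. It writes the coefficients of the general formulas as functions of x (free throws and team rebounds dropped, as above) and asks which x reproduces each simplified formula. All names are my own.

```python
def used_coeffs(x):
    """Coefficients of the general 'possessions used' formula."""
    return {"FGM": 1.0, "FGMS": x, "TO": 1.0, "REBO": -x, "DREBD": 1.0 - x}

def gained_coeffs(x):
    """Coefficients of the general 'possessions gained' formula."""
    return {"DFGM": 1.0, "DFGMS": x, "REBD": 1.0 - x, "DREBO": -x, "DTO": 1.0}

# Coefficients read off the simplified formulas (absent terms count as 0).
simplified_used = {"FGM": 1.0, "FGMS": 1.0, "TO": 1.0, "REBO": -1.0, "DREBD": 0.0}
simplified_gained = {"DFGM": 1.0, "DFGMS": 0.0, "REBD": 1.0, "DREBO": 0.0, "DTO": 1.0}

def matching_x(general, simplified, candidates=(0.0, 1.0)):
    """Return the candidate x values whose coefficients match exactly."""
    return [x for x in candidates if general(x) == simplified]

x_for_used = matching_x(used_coeffs, simplified_used)
x_for_gained = matching_x(gained_coeffs, simplified_gained)
print(x_for_used, x_for_gained)
```

The "used" formula only matches at x = 1 and the "gained" formula only matches at x = 0, so no single x reproduces both.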
So, consider this in terms of each team:
On any one possession, one team loses the ball and the other gains it, so the total possession change is 2 (-1 for one team and +1 for the other) in one direction.
However, look at the statistics that can be attributed to individuals. A FGM or FGMS can be attributed to a single player. So can a TO, a REBO and a REBD. But the values of DFGM and DTO cannot. So from a single player perspective, the formulas are as follows (simplified):
Possessions used = FGM + FGMS + TO - REBO
Possessions gained = REBD
These show the various things a player can do to impact possessions. However, you will see that credit for a possession change is given twice in the case of a missed FG followed by a defensive rebound. The player missing the shot is charged with a possession used, while the defensive rebounder gets a possession gained. Contrast this to a made shot, where the shooter uses a possession, and no one on the other team gains a possession. This is not logically consistent.
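The double counting is easy to see with a quick tally. This is a rough Python sketch with my own function and play names; it just adds up how much individual possession credit gets handed out on one play, using the x from the "used" formula for the shooter and the x from the "gained" formula for the rebounder.

```python
def credited_possessions(play, x_used, x_gained):
    """Total individual possession credit handed out on a single play."""
    if play in ("made_shot", "turnover"):
        # One player is charged a full possession, nobody else is credited.
        return 1.0
    if play == "miss_then_defensive_rebound":
        shooter_charge = x_used            # coefficient on FGMS
        rebounder_credit = 1.0 - x_gained  # coefficient on REBD
        return shooter_charge + rebounder_credit
    raise ValueError(f"unknown play: {play}")

# The simplified formulas: x = 1 when charging the shooter, x = 0 when
# crediting the rebounder.
wp_style = credited_possessions("miss_then_defensive_rebound", 1.0, 0.0)

# A single consistent x (0.5 is an arbitrary illustrative value).
consistent = credited_possessions("miss_then_defensive_rebound", 0.5, 0.5)

made = credited_possessions("made_shot", 1.0, 0.0)
print(wp_style, consistent, made)
```

With mismatched x values the miss-and-rebound play hands out two possessions of credit, while a made shot hands out one. With any single x it hands out exactly one.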
From this follow the criticisms leveled at the value of rebounding in WP. For the statistic to be logically consistent, the value of x should be constant on a given play. Although x could conceivably change with every variation of 10 players on the court, on a single play it should be consistent. If a value of x were assigned somewhere between 0 and 1, but not equal to 0 or 1, then the possession values of a defensive rebound, an offensive rebound, and a missed shot would all decrease relative to the possession values of made shots and turnovers.
This section also updated to correct previous error.
For example, if we use the league average ratio between offensive rebounds and defensive rebounds and then assign x based on the 'scarcity' of a rebound:
895 ORB / 3394 TRB = 26.4% of rebounds are offensive. So x would be 0.736, giving an offensive rebound roughly 3 times the value of a defensive rebound (0.736 / 0.264 is about 2.8), since they are roughly 3 times as rare. Also, this implies that in a missed-shot-defensive-rebound scenario, the defensive rebounder only gets about 1/4 of the credit for the possession change, while the shooter gets about 3/4.
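The arithmetic behind that x, as a few lines of Python. The two rebound totals are the ones quoted above; the variable names are mine.

```python
orb = 895    # offensive rebounds, as quoted above
trb = 3394   # total rebounds, as quoted above

offensive_share = orb / trb   # fraction of rebounds that are offensive
x = 1 - offensive_share       # scarcity-based weight for x
value_ratio = x / (1 - x)     # offensive vs defensive rebound value

print(f"offensive share = {offensive_share:.3f}")
print(f"x = {x:.3f}")
print(f"ORB is worth about {value_ratio:.1f}x a DRB")
```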
Then the formulas would look like:
Possessions used = FGM + 0.736*FGMS + TO - 0.736*REBO + 0.264*DREBD
Possessions gained = DFGM + 0.736*DFGMS + 0.264*REBD - 0.736*DREBO + DTO
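To get a feel for what this does to one player, here is a hedged Python sketch. It uses only the stats that can be attributed to an individual (FGM, FGMS, TO, REBO, REBD), and the box-score line is a made-up rebounding big man, not a real player.

```python
X_SCARCITY = 0.736  # the scarcity-based x from the example above

def individual_possessions(line, x_used, x_gained):
    """Return (possessions used, possessions gained) for one player's line."""
    used = (line["FGM"] + x_used * line["FGMS"] + line["TO"]
            - x_used * line["REBO"])
    gained = (1 - x_gained) * line["REBD"]
    return used, gained

# Hypothetical single-game line for a rebound-heavy player (not real data).
rebounder = {"FGM": 3, "FGMS": 4, "TO": 1, "REBO": 3, "REBD": 9}

# Simplified weights: x = 1 on the "used" side, x = 0 on the "gained" side.
wp_used, wp_gained = individual_possessions(rebounder, 1.0, 0.0)

# One consistent x on both sides.
alt_used, alt_gained = individual_possessions(rebounder, X_SCARCITY, X_SCARCITY)

print(f"simplified: used {wp_used:.3f}, gained {wp_gained:.3f}")
print(f"consistent: used {alt_used:.3f}, gained {alt_gained:.3f}")
```

For this made-up line, the possessions credited for defensive rebounds drop from 9 to about 2.4 once x is held constant, while possessions used barely move.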
So this is the reason WP punishes low-efficiency shooters and over-rewards rebounders (especially defensive rebounders). Other discussions, such as using NBA journeymen or failure stories with high WP to try to disprove it, are not mathematically relevant.
As for the reason most people still like WP? Well, even with the wacky distribution of possessions, the regression takes place afterward. So, using the points and possession values of the various actions, the correlation between wins and efficiency is applied. As such, the regression does some self-correcting for the incorrect possession definition. It clearly can't compensate entirely for the possession definitions, but it does have an impact. And of course the results are a forced correlation to wins - this is both a pro and a con. It means the results aren't purely mathematical, but are scaled up to include "intangible" effects. However, it does improve the relationship between team wins and individual players' WP.
Also as a note, the WP calculation is MUCH more complicated than what I have outlined here. I've used a very simplified version to show the basic flaws in the stat pointed out in depth at the APBR metrics board, located here:
To get a full understanding of the WP calculation, read the link I attached at the top, and get Berri's book. Even the link skips over some complicated calculations.
Sorry for the confusion in the initial post version. Let me know if you spot any more inconsistencies.