THE METHOD OF COMPARABLES AND TAX COURT VALUATIONS OF PRIVATE FIRMS: AN EMPIRICAL INVESTIGATION

Randolph Beatty, Susan Riffe, and Rex Thompson

Randolph Beatty is the Distinguished Professor of Accounting, Susan Riffe is an Assistant Professor of Accounting, and Rex Thompson is the Collins Professor of Finance at Southern Methodist University

April 1999

Key Words: Method of comparables, valuation, Tax Court

Data Availability: All data are available from public sources. Contact corresponding author for guidance.

We would like to thank Andy Alford; Jay Ritter; participants at the University of Michigan Tax Conference and AAA Western Regional Meetings; seminar participants at Southern Methodist University, University of Texas at Austin, and University of Houston; and two anonymous reviewers for helpful comments. Professor Riffe acknowledges the financial support of the SMU Arthur Andersen Junior Faculty Fellowship. We would also like to thank Business Valuation Services for the use of their research library.

Corresponding author: Susan Riffe
E-mail: sriffe@mail.cox.smu.edu

SYNOPSIS: This paper introduces a series of valuation models that mimic important features of regulatory prescriptions and legal precedent for the method of comparables as applied in estate and gift tax cases. We evaluate the models using out-of-sample estimation to determine which ones have the most desirable statistical properties for prediction. We then compare the predictions from these models to the valuations put forth by taxpayers, the IRS, and judges in estate and gift tax cases. This analysis reveals that taxpayers and the IRS propose values consistent with their underlying incentives, but significantly different from most bias-adjusted forecast models. The judges choose values consistent with the average value of the two experts and with bias-adjusted forecast models.
Thus, the tax litigation system appears to ultimately produce an estimate of value that is consistent with an objective application of the method of comparables to the available data.

INTRODUCTION

The imposition of estate and gift taxes requires the valuation of ownership interests in public and private firms. Public firms are typically valued for tax purposes by observing market prices. Since there are no observable prices for private firms, valuation experts must estimate value, and there are inevitable differences of opinion surrounding the appropriate estimates. According to IRC Regulation Section 20.2031, the actual market value of a similar publicly-traded enterprise should be taken into consideration along with other factors when valuing a private firm. This paper summarizes our analysis of one approach for valuing private firms that is consistent with this regulatory requirement, the “method of comparables.”1

The method of comparables is an attempt to argue by analogy that a private firm should have the same value as an identical public firm.2 This valuation method identifies an observable characteristic of both the private firm and comparable public firms that is associated with firm value (Cornell 1993). For example, a valuation expert might assert that the market price of a firm is proportionately related to reported earnings. If a set of comparable firms has an average P/e ratio of 10 and the private firm has EPS of $2, the value of the firm is estimated as $20 per share.3 If the expert estimates a discount for lack of marketability of 30 percent because there is not a liquid market for the stock, the implied final value is $14 per share. Although this example is simplistic, many expert valuations in estate and gift tax cases adopt an almost identical approach. The models used, the set of comparable firms chosen, and the marketability discounts applied are all subject to debate among litigants.
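The arithmetic of the example above can be written out as a short sketch. The function name and the three comparable P/e ratios are hypothetical and chosen only so that they average 10; the EPS of $2 and the 30 percent discount are the figures used in the text.

```python
# Illustrative sketch of the average P/e comparables calculation.
# The comparable P/e ratios below are hypothetical values chosen to average 10.

def comparables_value(eps, comparable_pe_ratios, marketability_discount):
    """Return (gross, net) value per share from an average P/e multiple."""
    avg_pe = sum(comparable_pe_ratios) / len(comparable_pe_ratios)
    gross = avg_pe * eps                           # value before any discount
    net = gross * (1.0 - marketability_discount)   # value after the discount
    return gross, net


gross, net = comparables_value(
    eps=2.0,
    comparable_pe_ratios=[8.0, 10.0, 12.0],  # hypothetical comparables
    marketability_discount=0.30,
)
print(f"gross = ${gross:.2f} per share, net = ${net:.2f} per share")
```

Under these inputs the sketch reproduces the $20 gross and $14 net values per share. Each of the three inputs (multiple, comparable set, discount) is a separate lever that an expert can adjust.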
The IRS has an incentive to maximize the estimated firm value to increase tax collections, while the taxpayer has an incentive to minimize the value to decrease tax payments. The legal system imposes costs and constraints that encourage IRS and taxpayer experts to produce reasonable value estimates, but these estimates are still expected to be subjective and consistent with underlying incentives. The judge must then rely on these subjective and imperfect valuations in deriving the final estimated value.

This paper compares valuations based on objective accounting models to those used in actual court cases. First, we establish a standard of comparison by evaluating a series of potential valuation models and determining which ones provide the best out-of-sample predictions. This exercise is motivated by the wide diversity of models used in practice and the lack of evidence about their relative statistical properties and forecasting accuracy when applied in an objective setting. Second, we compare the price estimates of our models to the private firm valuations of expert witnesses and judges for a sample of estate and gift tax cases. To our knowledge, this comparison has not been previously reported in the academic literature. This comparison is of interest because it is unclear whether tax court valuations, however estimated, are within a reasonable range of those suggested by the method of comparables models with preferable statistical properties applied to the same underlying accounting information.

Previous academic research on the method of comparables centers either on what valuation models should be used or how comparable firms should be identified (Boatsman and Baskin 1981). LeClair (1990) found that an earnings capitalization model using industry specific discount rates was superior to an earnings capitalization model (called the adjusted book value model) that used different discount rates for earnings from tangible versus intangible assets.
Further, he documented that considering assets and dividends information only slightly improves prediction accuracy. Using a median P/e model, Alford (1992) finds that defining comparability according to a 3 or 4 digit SIC code provides the most accurate price predictions and that further partitioning by total assets or return on equity yields no incremental improvement in predictions. Cheng and McNamara (1995) extend Alford’s work by showing that an equally-weighted P/e and P/b model is superior to a model using only one ratio. Using only a few selected industries, Hickman and Petry (1990) document the smallest absolute prediction error for regressions including: (1) an intercept, dividend growth, beta, and dividend-to-earnings payout ratio or (2) no intercept, earnings, dividends, and book value of equity.

We make several contributions to this academic literature. First, we evaluate the predictive ability of a broader range of models than considered previously and develop a statistical approach to identifying preferable models. We find that models based on weighted earnings and book value provide the best price estimates. Second, our paper is unique because it compares accounting model estimates based on the method of comparables with those used in tax litigation based on a variety of valuation methods. We find IRS valuations are persistently high compared to prediction models adjusted for bias, while taxpayer valuations are systematically low. Consistent with Englebrecht and Davison (1977) and Englebrecht (1979), judges choose values similar to the average of the two experts. The judges’ values are also consistent with models based on the method of comparables adjusted for bias. This suggests that the adversarial system used in court results in reasonable estimates of value on average.

This paper is also helpful to practitioners. Our analysis reveals how to compare the statistical quality of models and should help judges and attorneys tighten their price forecasts.
The evidence reported can also provide a basis for judges and attorneys to defend or criticize certain applications of the method of comparables. Insights from this paper should also prove helpful in related contexts such as forecasting IPO valuations.4

The paper is organized as follows. The next section summarizes the regulatory pronouncements and court decisions that motivate an investigation of the method of comparables. We then introduce the series of models considered. The models that are preferable for price estimation are established in the following section. Next, we compare court valuations to our preferred models’ valuations. Finally, the paper is summarized.

EMPIRICAL IMPLICATIONS FROM TAX REGULATIONS AND LEGAL PRECEDENT

According to IRC Regulation Section 20.2031-1(a), the value of a decedent’s gross estate is the total fair market value of all holdings at the time of death. Regulation Section 20.2031-1(b) defines fair market value as “the price at which the property would change hands between a willing buyer and a willing seller, neither being under any compulsion to buy or sell and both having reasonable knowledge of relevant facts.” Accordingly, Regulation Section 20.2031-2(f) suggests that net worth, prospective earnings power, dividend-paying capacity, and other relevant factors such as the value of comparable firms listed on a stock exchange should be considered in estimating fair value for private firms where market prices are not readily available.

Section 4.01 of Revenue Ruling 59-60 provides specific guidance on the type of factors that should be considered in private firm valuation, but notes that the following factors are not intended to be all-inclusive:

1. The nature of the business and the history of the enterprise from its inception.
2. The economic outlook in general and the condition and outlook of the specific industry in particular.
3. The book value of the stock and the financial condition of the business.
4. The earning capacity of the company.
5. The dividend-paying capacity.
6. Whether or not the enterprise has goodwill or other intangible value.
7. Sales of the stock and the size of the block to be valued.
8. The market price of stocks of corporations engaged in the same or a similar line of business having their stocks actively traded in a free and open market, either on an exchange or over-the-counter.

The last factor listed motivates the method of comparables as one of the many viable valuation approaches for court valuations. Revenue Ruling 59-60 Section 3.03 states that, in the absence of a public market for stock, “the next best measure may be found in the prices at which the stocks of companies engaged in the same or a similar line of business are selling in a free and open market.”

Implementation of the method of comparables ranges from a prediction of price based on an average comparable P/e ratio to more complex weightings of book value, earnings, and dividends. In Estate of Mark S. Gallo v. Commissioner, 50 T.C.M. 470, 476 (1985), the plaintiff’s expert witness, Cadenasso, inferred price per share from the average P/e ratio of comparable firms:

Cadenasso believed that a stock’s price-earnings ratio (market price per share divided by earnings per share), computed using earnings that are representative of the company’s earnings power, was a reliable indicator of market value and the ratio most widely used by investors.

Judges and experts retain latitude in specifying the relative importance of factors listed in Revenue Ruling 59-60. For example, the judge in Bader v. United States, 172 F. Supp. 833, 836 (S.D. Ill. 1959) suggests that book value is relatively unimportant:

I am of the opinion that the major factors to be considered are book values, earning power and dividend capacity.... Book value is a factor to be considered, still it is not a reliable measure of fair market value.
I am certain the investor is inclined to give earning power and dividend prospects much more weight in appraising the worth of any security.

In Central Trust v. United States, 305 F. 2d. 393, 399 (Ct. Cl. 1962), the judge describes the plaintiff’s valuation as follows:

There were four major factors which he considered in arriving at his conclusion. The first was book value. Utilizing the Company’s balance sheet as of December 31, 1954 (a date subsequent to the gift dates), the book value came to about $33 a share. In this connection he noted that the Company’s financial position at that time was sound, with a ratio of current assets to current liabilities of about 4.3 to 1. However, principally because of the age and multistoried inefficiency of the Company’s two main plants at Cincinnati and Norwood, he reduced the book value factor by 50 percent. The second factor was earnings. The Company’s audited annual statements for 1952, 1953 and 1954, which he accepted without adjustment, showed that the average of its earnings for these 3 years was $1.77 a share. He felt that, in the case of this company, a price earnings ratio of 6 to 1 would be appropriate, but, recognizing that this was the most important factor, he weighted it to give it double value. The third factor was dividend yield. In said 3 years, the company paid an annual dividend of 50 cents. Accepting this figure as the dividend the Company would be likely to pay in the future, he concluded that an investor would look for a 7 percent yield on this stock, and capitalized it on that basis. The fourth factor was the prior sales at $7.50. Adding and weighting these figures, he derived a value of $10.50 a share.

This example illustrates an aggregation of book value, earnings, dividends, and a prior sales price to arrive at an estimate of value where the weight applied to each variable is not rigorously motivated.
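The aggregation quoted from Central Trust can be approximated with a few lines of arithmetic. The opinion states only that earnings were given double value; treating the other three factors as single-weighted and combining them as a weighted average is an assumption made here for illustration, as are the function and variable names.

```python
# Sketch of the weighted aggregation described in Central Trust.
# Assumption (not stated in the quoted opinion): earnings carry weight 2,
# the other three factors carry weight 1, and the result is a weighted average.

def weighted_value(factor_values, weights):
    """Weighted average of per-share factor values."""
    total_weight = sum(weights[name] for name in factor_values)
    weighted_sum = sum(factor_values[name] * weights[name] for name in factor_values)
    return weighted_sum / total_weight


factor_values = {
    "book_value": 33.0 * (1 - 0.50),  # $33 book value reduced by 50 percent
    "earnings": 1.77 * 6,             # average EPS times a 6-to-1 P/e ratio
    "dividends": 0.50 / 0.07,         # 50-cent dividend capitalized at 7 percent
    "prior_sales": 7.50,              # prior sales price
}
weights = {"book_value": 1, "earnings": 2, "dividends": 1, "prior_sales": 1}

estimate = weighted_value(factor_values, weights)
print(f"weighted value = ${estimate:.2f} per share")
```

Under these assumed weights the result is about $10.48 per share, close to the $10.50 reported in the opinion. Changing the weights moves the estimate, which is the sense in which they are not rigorously motivated.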
Revenue Ruling 59-60 Section 5 notes that the circumstances of each case and the nature of the underlying business will help indicate the relative weight that should be applied to the different factors, but there is very limited guidance on how to determine these weights.

Implementation of the method of comparables necessitates selection of a sample of comparable firms. Case law suggests that industry and size are important factors to consider. For example, the judge in Central Trust v. United States, 305 F. 2d. 393, 406 (Ct. Cl. 1962) criticizes the defense expert witness for selecting Crown Cork & Seal as a comparable firm because it operated in a different industry and sales revenue was too large relative to the Heekin Can Company:

The witness’ study has certain meritorious features.... However, it has certain weaknesses too, the principal one being the limitation of the comparative companies to two, one of which, Crown Cork & Seal, leaves much to be desired as a comparative because its principal business is the manufacture of bottle caps and bottling machinery, an entirely different business. Only 40 percent of its business is in can production. On the basis of size too there are great differences. At that time, Crown, including its foreign subsidiaries, was doing about $115,000,000 worth of business as against Heekin’s $17,000,000.

These examples suggest that experts are given only modest guidance in determining the critical modeling assumptions or most important variables for estimating values. There is great diversity in the models applied and experts appear to emphasize or de-emphasize some factors with only qualitative justification. The variance in approaches used potentially leads to significant differences between the values suggested in court and those suggested by objective statistical models that effectively use the basic information available.
In the following sections, we compare a progression of models that systematically consider many of the factors listed in Revenue Ruling 59-60 such as earnings, book value, and dividends. Based on Alford (1992), our analysis uses industry to identify comparable firms. However, we also explore the more contentious issue of whether controlling for size improves price predictions by including total assets as a variable in one of our valuation models. After we establish our preferable models, we compare their price predictions to those suggested by experts and judges in actual court cases.

MODEL DEVELOPMENT

The models considered in this section are suggested by legal precedent or the academic literature as good candidates. Although many other potential models could be explored, we have chosen to consider a series of parsimonious models that logically flow from the previous literature and generate interesting comparisons. We also limit our focus to modeling only earnings and book value accounting information because these items are identified as important in regulations, and we do not incorporate concepts such as “general economic outlook” that are difficult to systematically measure. Figure 1 summarizes our eight valuation models.

(Insert figure 1 here.)

Proportional Pricing

The first five models listed in figure 1 are proportional because they do not contain an intercept. Each of these models includes different multipliers or ratios for earnings and book value. Accordingly, the weight or relative contribution of each multiplier must be specified. The first four models equally weight the multipliers since this approach is prevalent in the academic literature. Although these models constrain the relative contribution of the multipliers, they are simple to apply.
Further, they are less likely to overfit the valuations of a specific sample and then perform poorly in predicting prices for other firms not included in the sample.5

First, we give equal weight to the average P/e and P/b multipliers because these ratios are commonly used in practice. However, it can be shown that basing price predictions on either the P/e, P/b or both ratios will yield upward-biased price estimates. Second, to mitigate the upward bias, we equally weight the inverse average e/P and b/P ratios in the second model. This alternative corrects for the first model’s upward bias even though it is based on the same earnings and book value information.6 Third, we examine a model used by Cheng and McNamara (1995), equal weighting of median P/e and P/b ratios. Fourth, we investigate the ratio of averages multiplier (i.e., average price divided by either average earnings or book value) suggested by Lev and Sunder (1979).

As seen in the previous section, judges and valuation experts sometimes give more weight to one piece of accounting information than another. Accordingly, we investigate a fifth proportional “regression weight model” that empirically derives the appropriate weights rather than assuming they are equal. First, a regression of price on earnings and book value is estimated without an intercept. The regression coefficients can then be multiplied by earnings or book value to estimate price. The weights are inferred by deflating the estimated regression coefficients by the inverse average e/P or b/P ratio.7 These empirically-derived weights will not necessarily be equal to one another or sum to one.

Linear Pricing

Our final models are linear regressions with an intercept. In the linear earnings and book model, a positive intercept shows the priced future cash flow generating potential of firms with no current earnings or book value.
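The difference between the first two proportional models described above can be illustrated with a small sketch. The comparable firms and all function names are hypothetical. The sketch computes in-sample mean percentage pricing errors (actual price minus predicted price, deflated by actual price) for the average P/e P/b model and for the inverse average e/P b/P model.

```python
from statistics import mean

# Hypothetical comparable firms: price, earnings and book value per share.
COMPS = [
    {"price": 12.0, "eps": 2.0, "bvps": 10.0},
    {"price": 30.0, "eps": 1.5, "bvps": 12.0},
    {"price": 18.0, "eps": 1.0, "bvps": 20.0},
    {"price": 45.0, "eps": 3.0, "bvps": 15.0},
]


def average_multiple_price(firm, comps):
    """Model 1: equal weights on the average P/e and average P/b multipliers."""
    avg_pe = mean(c["price"] / c["eps"] for c in comps)
    avg_pb = mean(c["price"] / c["bvps"] for c in comps)
    return 0.5 * avg_pe * firm["eps"] + 0.5 * avg_pb * firm["bvps"]


def inverse_average_price(firm, comps):
    """Model 2: equal weights on the inverses of the average e/P and b/P ratios."""
    avg_ep = mean(c["eps"] / c["price"] for c in comps)
    avg_bp = mean(c["bvps"] / c["price"] for c in comps)
    return 0.5 * firm["eps"] / avg_ep + 0.5 * firm["bvps"] / avg_bp


def pct_error(actual, predicted):
    """Percentage pricing error: (actual - predicted) / actual."""
    return (actual - predicted) / actual


def mean_pct_error(model, comps):
    """In-sample mean percentage pricing error of a model over its own comparables."""
    return mean(pct_error(c["price"], model(c, comps)) for c in comps)


bias_avg_multiple = mean_pct_error(average_multiple_price, COMPS)
bias_inverse_avg = mean_pct_error(inverse_average_price, COMPS)
print(f"average P/e P/b model:         {bias_avg_multiple:+.3f}")
print(f"inverse average e/P b/P model: {bias_inverse_avg:+.3f}")
```

In this sketch the first model's mean error is negative (predicted prices exceed actual prices on average) whenever the ratios differ across firms, because the average of a ratio times the average of its reciprocal is at least one. The second model's in-sample mean error is zero up to rounding.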
In an additional model, we investigate the usefulness of dividends in a linear regression context because Revenue Ruling 59-60 suggests that dividend-paying capacity is another important characteristic in pricing equity. While managers have discretion over dividend policy and the role of current dividends in valuation is not obvious (Miller and Modigliani 1961), dividends can serve as a surrogate for omitted, priced firm characteristics such as expected future growth. Finally, given the previous discussion about the potential importance of size, we examine a linear regression model that includes total assets.

Percentage Pricing Errors

We advocate the use of percentage pricing errors, defined as actual price minus predicted price deflated by actual price, as a metric of forecast accuracy. Deflating pricing errors by actual price yields a relative measure of prediction accuracy that is more meaningful when comparing models for stocks with a large range of prices. For example, with percentage pricing errors a model that is off 50 cents for a $1 stock has the same percentage pricing error as a model that is off $5 for a $10 stock. With undeflated pricing errors, the $5 error will receive much more weight in evaluating model performance.

We examine several forecast models that are expected to do particularly well according to percentage pricing errors. For example, proportional models based on inverse average e/P and b/P ratios yield zero mean percentage pricing errors.8 Further, regression-based models in which the input variables are deflated by price serve to minimize the sum of squared percentage pricing errors. When deflating by price in the regression models, the left-hand side variable becomes a column of ones and the right-hand side variables are scaled by price. We use this approach for the last four models listed in figure 1.
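A minimal sketch of the price-deflated regression without an intercept may make the "column of ones" construction concrete. The function names and the four firms are hypothetical; the firms are priced exactly at 5 times earnings plus 1.2 times book value so that the fit can be checked by eye. The two coefficients are obtained by solving the two normal equations directly rather than with a statistics library.

```python
def deflated_regression_weights(firms):
    """No-intercept least squares of a column of ones on (e/P, b/P).

    Because (P - a*e - c*b) / P = 1 - a*(e/P) - c*(b/P), the fitted
    coefficients minimize the sum of squared percentage pricing errors.
    """
    x = [f["eps"] / f["price"] for f in firms]    # earnings deflated by price
    z = [f["bvps"] / f["price"] for f in firms]   # book value deflated by price
    sxx = sum(v * v for v in x)
    szz = sum(v * v for v in z)
    sxz = sum(a * b for a, b in zip(x, z))
    sx, sz = sum(x), sum(z)
    det = sxx * szz - sxz * sxz
    if det == 0:
        raise ValueError("e/P and b/P are collinear; coefficients are not identified")
    # Cramer's rule on the 2x2 normal equations.
    coef_e = (sx * szz - sz * sxz) / det
    coef_b = (sz * sxx - sx * sxz) / det
    return coef_e, coef_b


def predict_price(firm, coef_e, coef_b):
    """Proportional price prediction from earnings and book value per share."""
    return coef_e * firm["eps"] + coef_b * firm["bvps"]


# Hypothetical firms priced exactly at 5 * eps + 1.2 * bvps.
FIRMS = [
    {"price": 17.0, "eps": 1.0, "bvps": 10.0},
    {"price": 19.6, "eps": 2.0, "bvps": 8.0},
    {"price": 51.0, "eps": 3.0, "bvps": 30.0},
    {"price": 26.5, "eps": 0.5, "bvps": 20.0},
]

coef_e, coef_b = deflated_regression_weights(FIRMS)
print(f"earnings coefficient = {coef_e:.3f}, book value coefficient = {coef_b:.3f}")
```

With these constructed prices the fitted coefficients should come out very close to 5 and 1.2. With real data the fit would not be exact, and adding an intercept scaled by price would give the linear counterpart.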
We call the proportional (without an intercept) and linear (with an intercept) models based on this approach the “deflated regression weight” and “deflated linear” models, respectively.

ANALYSIS OF PUBLIC FIRMS

Data

(Insert table 1 here.)

We collect data for all Compustat active and research firms from 1980 through 1992 that have complete price, earnings, book value, and number of shares for the current year.9 As shown in table 1, this produces over 73,000 observations. Because we estimate our models by industry subgroups and years, we require at least 20 firms in each 3-digit SIC code per year.10 We eliminate firms with earnings or book value per share less than or equal to zero to ensure that proportional model results are meaningful. We also remove approximately the upper 1 percent of the earnings and book value distributions (by imposing the restriction that earnings not be greater than $9 per share and book value not be greater than $66 per share) to ensure that results are not driven by extreme observations.11 These data restrictions result in a sample of 28,318 usable firm/year observations that represent 598 SIC code/years. Table 2 reports descriptive statistics and a correlation matrix for the variables used in the analysis deflated by price.

(Insert table 2 here.)

Research Design

(Insert table 3 panel A here.)

Table 3 presents the average coefficients, percentage pricing error descriptive statistics, and statistical comparisons for the eight models. We use per share data since this approach is prevalent in actual court cases. In panel A of table 3, the coefficients are estimated for each 3-digit SIC code and year and then averaged over all the SIC codes and years. These coefficients represent the average amount multiplied by book value, earnings, and dividends per share as well as total assets to estimate price. The mean t-statistics in parentheses for the regression-based models are calculated across the 568 SIC codes and years.
Asterisks indicate whether the mean t is at least 2, 10, or 30 standard deviations from zero. Under the null hypothesis, each t-statistic has approximately unit variance, so the mean across the 568 SIC codes and years has a standard deviation of $1/\sqrt{568} \approx .042$. Thus, for example, a mean of .68 is 16 standard deviations from the expected value of zero.

Panel A of table 3 also presents mean and root mean squared error (RMSE) descriptive statistics for percentage pricing errors. The predicted price is determined using out-of-sample estimation where the models are estimated multiple times for each SIC code/year, dropping one firm for each iteration. Negative price predictions are replaced with zero, which improves the regression-based models only slightly.12 For each SIC code/year, the mean deflated prediction error is calculated, and the reported amounts are the averages across SIC codes and years. The global RMSE, which measures both dispersion and bias, is calculated by averaging all 28,318 squared deflated pricing errors and taking the square root.

(Insert table 3 panel B here.)

Panel B of table 3 summarizes pair-wise comparisons of each of the proposed models. The asymptotic z-statistic shows the number of standard deviations the mean difference in squared deflated prediction errors is from zero. A positive (negative) statistic with an absolute value greater than two indicates that the model represented in the column is significantly superior (inferior) to the model represented in the row.

Results

Expert testimony in numerous estate and gift tax valuation cases uses average P/e and/or P/b ratios. For example, in the Commerce Clearing House IRS Valuation Training for Appeals Officers: A Coursebook (1998), the only model presented within the context of the method of comparables is the average P/e model.
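The out-of-sample evaluation that underlies these comparisons can be sketched as follows for a single SIC code/year. The function names are hypothetical, and the inverse average e/P b/P model is used only as an example predictor; any function with the same signature could be substituted. Each firm is dropped in turn, the model is estimated on the remaining firms, negative predictions are replaced with zero, and the RMSE is taken over the percentage pricing errors.

```python
from math import sqrt
from statistics import mean


def inverse_average_price(firm, comps):
    """Example predictor: equal weights on inverse average e/P and b/P ratios."""
    avg_ep = mean(c["eps"] / c["price"] for c in comps)
    avg_bp = mean(c["bvps"] / c["price"] for c in comps)
    return 0.5 * firm["eps"] / avg_ep + 0.5 * firm["bvps"] / avg_bp


def leave_one_out_pct_errors(firms, model):
    """Percentage pricing errors where each firm is predicted from all the others."""
    errors = []
    for i, firm in enumerate(firms):
        comps = firms[:i] + firms[i + 1:]          # drop the firm being priced
        predicted = max(model(firm, comps), 0.0)   # negative predictions set to zero
        errors.append((firm["price"] - predicted) / firm["price"])
    return errors


def rmse(errors):
    """Root mean squared error; reflects both dispersion and bias."""
    return sqrt(mean(e * e for e in errors))


# Hypothetical firms in one SIC code/year.
firms = [
    {"price": 12.0, "eps": 2.0, "bvps": 10.0},
    {"price": 30.0, "eps": 1.5, "bvps": 12.0},
    {"price": 18.0, "eps": 1.0, "bvps": 20.0},
    {"price": 45.0, "eps": 3.0, "bvps": 15.0},
]
errors = leave_one_out_pct_errors(firms, inverse_average_price)
print(f"mean error = {mean(errors):+.3f}, RMSE = {rmse(errors):.3f}")
```

A global RMSE in the sense used above would pool the squared errors from every SIC code/year before taking the square root, rather than averaging per-group RMSEs.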
Despite the popularity of these ratios, our analysis suggests they clearly provide the poorest price predictions.13 Panel A of table 3 documents that the average P/e P/b model has the largest absolute mean deflated prediction errors and RMSE of the models considered, and the large negative z-statistics in the first column of panel B confirm that the differences are significant. The negative mean error reported in panel A is consistent with the model suffering from extreme upward bias (i.e., predicted price greater than actual price) on average.

The inverse average e/P b/P model produces a substantial decline in both the magnitude of the mean deflated prediction errors and the RMSE. As expected, the mean percentage error is close to zero. The RMSE for percentage pricing errors declines more than 80 percent from 4.41 to .84. 14 Experts should be aware of the improvement in forecast accuracy from this rather innocuous rearrangement of the same accounting information. Panel B of table 3 shows that the inverse average e/P b/P model is superior to all of the other equal weighted models as well. There is no significant difference between the ratio of averages and median P/e P/b models.

The regression weight model is potentially advantageous because it does not constrain the weights to sum to one and allows the weights to vary by industry. Panel B of table 3 shows that among the entire set of proportional models the deflated regression weight e b model is preferable.

Table 3 reveals several surprising results for the linear models. First, the premier deflated linear model is based only on earnings and book value. The addition of dividend and asset15 information in the regressions does not improve out-of-sample forecasts and, in fact, reduces forecast accuracy.16 Second, the best linear model does not outperform the best proportional model in out-of-sample forecasts.
These results suggest that the estimation error in the intercept, dividend and asset coefficients hurts the out-of-sample forecasts more than the average level of the coefficients helps. Models including these additional variables apparently overfit the data so that the resulting forecasts are not as accurate.17

In summary, the data reveal that the average P/e and P/b model commonly used in court case valuations produces the poorest forecasts of the eight models examined. While the inverse average e/P b/P model improves forecast quality, court experts have not generally used this model specification. The use of a deflated regression weight model can also be recommended on the basis of forecast accuracy and is an improvement over both equal weight models used previously in the academic literature and models commonly used by experts that give book value very little weight. Our evidence does not lend much support to the regulatory prescription that experts should incorporate dividends and assets into their valuation estimates. Thus, our set of preferable models includes the inverse average e/P b/P model, the deflated regression weight e b model, and the deflated linear e b model. These models will be our standard of comparison for the valuations put forth by experts in tax court settings.

ESTATE AND GIFT TAX VALUATION SAMPLE RESULTS

Incentives of Taxpayer Experts, IRS Experts, and Tax Court Judges

Our second section suggests that experts can strategically select (1) the valuation model, (2) the sample of comparable firms, and (3) parameter estimates (e.g., the marketability discount). The expert’s valuation is expected to be biased toward the benefit of the party compensating them. Without cost, expert witnesses would rationally search for the combination of method, sample, and parameter estimates to optimize tax payments for the party they are testifying for.
The legal process constrains this opportunistic behavior because experts are repeat players in the valuation arena, and documentable lack of consistency in a valuation approach across cases can cause impeachment of an expert’s testimony. In addition, rules of evidence and legal precedent suggest the judge is not obligated to accept exaggerated testimony.18 For example, the judge in Estate of Mark S. Gallo v. Commissioner, 50 T.C.M. 470, 481 and 483 (1985) dismissed the respondent’s valuation as “unreliable” and containing “fatal flaws.” Clearly, an expert valuation that is deemed inconsistent or unreasonable by the judge can harm the case and impose significant costs on the party the expert represents.

U.S. Tax Court judges are appointed by the president for 15-year terms, and their incentives to accept, reject, or consider expert valuation testimony are not obvious. Although judges may have preferences for certain valuation approaches, since the IRS and taxpayer legal counsel are repeat players in the estate and gift tax area, each side is capable of anticipating a judge’s preferences. Importantly, the burden of proof in court is usually on the taxpayer to show that the IRS has made an erroneous claim in its notice of tax deficiency. Accordingly, where both sides present equally credible expert testimony, the court will tend to favor the IRS position.19

These observations of the structure of the court system suggest that, consistent with their incentives, the taxpayer expert will offer a low valuation while the IRS expert will offer a high valuation. However, these expert valuations are constrained to be consistent with approaches that are broadly acceptable to tax courts.

Analysis of Private Firm Valuation Court Case Sample

(Insert table 4 here.)
We identify a sample of estate or gift tax cases involving private firm valuation from Commerce Clearing House’s Tax Court Memorandum Decisions (1975-1993) and the 1996/1997 cumulative edition of the Federal Tax Valuation Digest compiled by Idelle A. Howitt (1998). These sources capture regular and memorandum decisions from the U.S. Tax Court, Claims Court and District Courts through 1995. We limit our investigation to taxpayer dates of death occurring after 1960 so we can use Compustat data to estimate our valuation models for comparable public firms. We are unable to use 34 of these cases in our analysis because they do not provide earnings and book value per share information for the private firm.20 Similar to our approach for the public firm analysis, we eliminate two cases because the private firms have negative earnings. Table 4 lists case identifier and court information for the 31 cases in our final sample. The year in parentheses is the date of the court opinion. The numbers in the columns represent the date of death or gift and the SIC code of the private firm.

(Insert table 5 here.)

Table 5 summarizes the percentage of cases in our sample where the summary opinion indicates that specific variables, ratios or valuation methods were used by the judge or expert witnesses in the case. Not surprisingly, all cases used accounting earnings in estimating value. This can be compared to the 90 percent of cases that used book value and 68 percent that used dividends. The average P/e and P/b ratios included in our first model were used in 84 percent and 48 percent of the cases, respectively, despite the upward bias they generate. The method of comparables is the predominant method in our sample and was used in 87 percent of the cases. Fifty-two percent of the cases used the adjusted book value method, which converts the book value of equity to market value based on the estimated value of individual assets and liabilities.
Discounted cash flow analysis was used in only 32 percent of the cases. Although the values put forth in court cases are based on numerous variables and methods, our purpose is to compare these values, however determined, with estimates obtained from an objective application of the method of comparables to the data available.

(Insert table 6 here.)

Table 6 tests for differences across taxpayer, IRS, and judge estimates of the gross valuations (before subtracting the discount), the discount for lack of marketability, and net valuations (after subtracting the discount).21 The dependent sample t-tests confirm that the taxpayer expert mean values of $567 gross ($386 net) per share are significantly lower than the judge valuations of $950 gross ($676 net). Also, the judge valuations are lower than the IRS means of $1,333 gross ($1,046 net). Interestingly, the marketability discount magnifies these differences because the taxpayer (IRS) discount estimates are significantly higher (lower) than the judge discount estimates. Similar differences are documented for the nonparametric Wilcoxon signed ranks tests, which may be more reliable since the data distributions are clearly skewed.22 Thus, our sample results verify the intuition that valuation experts carefully implement their chosen valuation methods to further the objectives of the party for whom they testify and, accordingly, choose values significantly different from one another.23

(Insert table 7 here.)

We now examine the belief that judges “split the difference” between expert witness valuations. A New York Times article by Drew and Johnston (1996) suggests:

Lawyers note that it can be hard to agree on what a business, and especially a piece of one, is worth. The Tax Court itself, they say, sometimes seems inclined to split the difference between claims made by the IRS and valuations proffered by heirs to an estate (15).
Englebrecht and Davison (1977) and Englebrecht (1979) provided evidence consistent with the court choosing a “compromise value” equal to the mean of the extreme values put forth by the IRS and taxpayer in estate and gift tax valuations. However, Englebrecht and Jamison (1979) concluded that the court did not tend to choose compromise values for property in charitable contributions cases. For our sample, the judge valuations are very close to the average of the expert valuations. Panel A of table 7 reports that the judges’ gross mean (median) private firm valuation is $950 ($125) per share compared to $950 ($104) per share for the average of the two expert witnesses.24 The result is consistent with the notion that judges use a simple averaging of the expert testimony.

Round and Erickson (1996) suggest there is a trend toward greater sophistication in the court so that judge valuations in more recent periods are less likely to simply represent an average of the two experts. In Sirloin Stockade v. Commissioner, 40 T.C.M. 928 (1980), the judge rejected the IRS valuation because he suspected that the IRS had brought a weak case before the Court to obtain a compromise value. In Buffalo Tool and Die Mfg. Co. v. Commissioner, 74 T.C. 441 (1980), the judge stated that he may not choose a “middle-of-the-road compromise” even though each of the parties might expect him to do so. In panel B of table 7, we test this conjecture by dividing our sample into cases decided in 1980 or before versus after 1980. There is no significant difference between the expert average and the judge value in either time period, although the t-statistic is larger in the later period. Similar results were obtained when separately analyzing cases before and after 1985.

(Insert table 8 here.)
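The split-the-difference comparison in table 7 amounts to a dependent sample (paired) t-test of each judge value against the midpoint of the two expert values. The sketch below shows that calculation under stated assumptions: the five per-share values are hypothetical and are not drawn from the court case sample, and the function name `paired_t_statistic` is introduced here only for illustration.

```python
import math
import statistics


def paired_t_statistic(x, y):
    """t-statistic for the mean of the paired differences x[k] - y[k]."""
    diffs = [a - b for a, b in zip(x, y)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err


# Hypothetical per-share values for five cases (NOT the paper's data).
taxpayer = [40.0, 100.0, 250.0, 60.0, 900.0]
irs = [90.0, 180.0, 400.0, 150.0, 1500.0]
judge = [70.0, 130.0, 330.0, 100.0, 1250.0]

# Midpoint of the two expert valuations for each case.
expert_average = [(t + i) / 2 for t, i in zip(taxpayer, irs)]

# A t-statistic near zero is consistent with "splitting the difference".
t_stat = paired_t_statistic(judge, expert_average)
print(round(t_stat, 2))
```

With 31 cases, as in the paper, the statistic would be compared against a t distribution with 30 degrees of freedom; the Wilcoxon matched-pairs test reported alongside it is not implemented in this sketch.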
Table 8, panel A compares private firm valuations from our earnings and book value models25 to valuations offered by taxpayer experts, judges, and IRS experts before considering any discount for the lack of marketability.26 For each case, unique model coefficients were estimated for comparable firms from the year prior to the date of death with the same SIC code as listed in Table 4.27 Price predictions were calculated as the coefficients from the comparable firms multiplied by earnings and book value for the private firms. The predictions were deflated by the judges’ values to yield percentage pricing errors in the spirit of those reported for public firms in the previous section. In panel B, the average bias in each empirical model was removed by adding or subtracting the mean percentage pricing error reported in table 3. This approach is reported because, in a forecasting context, the known average bias of a particular model can easily be removed to improve predictions. The panel B comparisons can be interpreted as the expert’s prediction based on his or her preferred valuation approach versus the model’s unbiased prediction based on the method of comparables.

Before adjusting for the average bias, as reported in panel A, the average P/e P/b model leads to valuations significantly above those of all of the participants in the tax court process. Our best performing models, given the analysis in the previous section, are the inverse average e/P b/P model, the deflated regression weight model, and the deflated linear model with earnings and book. The judge has a responsibility to evaluate the evidence presented by the experts and provide a fair judgment on the private firm’s value. Ideally, this value would be similar to those predicted by the most accurate models. Panel A of table 8 confirms that the judges’ values are not significantly different from those of our three best models.
The only exception is that the Wilcoxon test is significant at the .10 level for the deflated regression weight earnings book model. The IRS values are significantly higher than these models’ values, consistent with the IRS’s incentive to maximize tax collection. The taxpayer values are significantly lower than the values for the inverse average e/P b/P model, as expected, but they are not significantly lower than the values for the deflated regression weight and deflated linear earnings book models.

Panel B of table 8 puts all of the models on a more equal footing by removing the average bias reported in table 3 for each model. The first column of panel B shows that the means and medians for the models are now very similar. The most striking difference in panel B compared to panel A is for the average P/e P/b model, where a large downward adjustment of 1.18 (expressed as a multiple of the judge’s value) brings the values for this model more in line with those of the other models. By contrast, the inverse average e/P b/P model has only slight bias on average, so its valuations are adjusted downward by only .02. Panel B reveals that the judges’ values are now not significantly different from any of the bias-adjusted models. Further, with the exception of the average P/e P/b model, the taxpayer values are significantly lower than the bias-adjusted valuations. Alternatively, with the exception of the ratio of averages model, the IRS values are significantly higher than the bias-adjusted amounts. Apparently, the judge is able to compensate for the bias in the experts’ valuations and present a statistically unbiased estimate of value.28

SUMMARY, CONCLUSIONS AND IMPLICATIONS

Our analysis for public firms shows that some of the common valuation approaches are not preferable.
Although the average P/e P/b model is well accepted, the resulting bias in the price predictions is quite severe relative to reasonable alternatives not generally used in court cases, such as the inverse average e/P or b/P model. Although none of the actual court cases we read used regression analysis to estimate values, the statistical advantages of regression are well known, and we show that a deflated regression weight model is superior to the other models considered. In addition, we have demonstrated that it is inappropriate for experts to routinely dismiss book value as unimportant. Finally, including dividends and assets along with book value and earnings, although supported by legal precedent, may actually harm rather than help price predictions in out-of-sample contexts.

The results from the analysis of private firm valuation tax cases highlight behavior by expert witnesses that is consistent with their underlying incentives. Further, the IRS and taxpayers choose values significantly different from one another and different from the inverse average e/P b/P, deflated regression weight, or deflated linear models in directions consistent with their objectives. Finally, judges’ values are not significantly different from our best forecast models, which suggests that the adversarial system in tax court results in firm valuations that are unbiased in a statistical sense. Although our conclusions are based on a small sample, these results indicate that the tax court is ultimately effective in assigning values to privately held firms. Whether a less costly alternative to adjudication exists remains an open question.

We have emphasized estate and gift tax valuations of private firms. This setting requires an implementable technique for establishing and resolving disputes over the proper tax owed to the government.
Although tax valuations are interesting in their own right, our results on model structure are relevant for other settings that utilize the method of comparables. Mergers and acquisitions, initial public offerings of equity securities, civil court actions, and corporate control contests are all arenas in which participants seek to establish independent firm valuation. Although valuation experts use numerous methods in these contexts, the method of comparables is generally an important input to an expert opinion.

ENDNOTES

1 We acknowledge that there are other viable approaches for valuation such as discounted cash flow analysis and the appraisal method. However, we chose to focus on the method of comparables because it is one of the most commonly applied methods in practice.
2 The method of comparables is an abstraction in the sense that private and public firms tend to differ along many dimensions such as cost of capital, number of shares outstanding, and agency issues between managers and owners. Despite these differences, practitioners and the court must believe that the method of comparables provides useful evidence about the value of private firms because it is one of the predominant private firm valuation methods (see Table 5).
3 Throughout the paper, P/e refers to the price-to-earnings ratio and P/b refers to the price-to-book ratio.
4 The linear, but not the proportional, models we examine can be used for IPO firms with negative earnings and book value (see figure 1).
5 Our empirical work shows overfitting to be extremely important.
6 The following intuition explains why the inverse average e/P b/P model is expected to improve predictions. With the P/e or P/b ratio, the measurement error in earnings or book value is reflected in the denominator. This creates convexity so that the price predictions are upward biased. The inverse average e/P or b/P ratio contains the measurement error in the numerator of the ratio and avoids this problem.
For example, consider two firms, each with long run average economic earnings of $2 and price of $40. One firm has positive measurement error of $1 and reported earnings of $3 while the other has negative measurement error of $1 and reported earnings of $1. The average P/e ratio for the two based on reported earnings would be (40/3 + 40/1)/2 = 26.67, yielding an average price forecast of $53.33, with one price of $80 and one of $26.67. The inverse average e/P ratio multiplier would be 20 (1/((3/40 + 1/40)/2)), which yields an average price forecast of $40 and separate prices of $60 and $20. An appendix containing a formal proof is available from the authors.
7 For example, from table 3 panel A the average coefficient on earnings of 4.88 can be deflated by the average inverse average e/P ratio of 6.1 to yield an inferred weight on earnings of .80. The inferred weight on book value is .78.
8 It is easily shown that the average $\hat{P}/P$ from the models is unity.
9 We use earnings before extraordinary items and book value of total stockholders’ equity. We define dividends as current earnings minus the change in book value. This measure is essentially all non-earnings changes in capital where stock issuances are viewed as negative dividends. However, if lagged book value is not available, we use actual dividends paid to avoid eliminating these firms from our sample. Our results were not affected by using actual dividends for all firms.
10 Alford (1992) showed that prediction accuracy increased when comparable firms were defined at the 3-digit compared to the 2-digit level. The restriction on the minimum number of firms per SIC code provides a reasonable number of observations to estimate regression-based models. Sensitivity analysis showed that the performance of the nonregression-based models was the same when using 10-19 firms per SIC code/year.
11 Experts commonly remove extreme observations in defining their set of comparable firms.
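The two-firm example in endnote 6 can be reproduced in a few lines. The sketch below uses only the figures stated in that endnote (price of $40 for both firms, reported earnings of $3 and $1); the variable names are chosen here for illustration and, as in the endnote, each multiple is applied back to the same two firms rather than out of sample.

```python
# Two firms from endnote 6: same price, reported earnings of $3 and $1
# (true long run earnings of $2 plus or minus $1 of measurement error).
prices = [40.0, 40.0]
reported_earnings = [3.0, 1.0]
n = len(prices)

# Average P/e multiple: measurement error sits in the denominator.
avg_pe = sum(p / e for p, e in zip(prices, reported_earnings)) / n

# Inverse of the average e/P ratio: measurement error sits in the numerator.
inv_avg_ep = 1 / (sum(e / p for p, e in zip(prices, reported_earnings)) / n)

pe_forecasts = [avg_pe * e for e in reported_earnings]
inv_forecasts = [inv_avg_ep * e for e in reported_earnings]

print("average P/e multiple:", round(avg_pe, 2))
print("P/e forecasts:", [round(f, 2) for f in pe_forecasts],
      "mean:", round(sum(pe_forecasts) / n, 2))
print("inverse average e/P multiple:", round(inv_avg_ep, 2))
print("inverse forecasts:", [round(f, 2) for f in inv_forecasts],
      "mean:", round(sum(inv_forecasts) / n, 2))
```

The average P/e forecasts average $53.33 against a true price of $40, while the inverse average e/P forecasts average exactly $40, which is the upward bias the endnote describes.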
12 Only .7 percent of the observations that could have yielded negative price predictions (i.e., regression-based proportional and linear models) had to be replaced with a zero price. Our primary inferences are not sensitive to this adjustment.
13 Our conclusions are based on our large sample of public firms. The complete expert valuation process may implicitly adjust for some of the statistical problems using sample selection techniques or other modifications to mitigate the inherent flaws in some of the models.
14 Analysis not included in the paper revealed that using either the univariate inverse average e/P or median P/e model alone produces results that are superior to an equal-weighted average P/e P/b model. Thus, it is better to use less information with an improved specification than to use more information in a badly misspecified form.
15 Adding assets, but not dividends, does not significantly improve the model forecasts relative to the model without assets.
16 This result for dividends and assets is consistent with LeClair (1990). However, because LeClair’s “adjusted book value” model is an earnings-based model that did not include actual book value of stockholders’ equity, a direct comparison of our results with LeClair’s is not possible.
17 Table 3 panel A reveals that the mean t-statistics across 568 experiments for the intercept, dividends, and assets are much smaller in magnitude than the mean t-statistics for earnings and book even though all of the mean t-statistics are at least two standard deviations from zero. The out-of-sample results do not lead us to conclude that the appropriate intercept, dividend or asset coefficients should be zero. To understand how overfitting hurts out-of-sample forecasting, suppose the true generating process has an intercept of .5 but sample sizes of, say, 20 firms, tend to produce intercepts with a mean of .5 but a standard error of one.
In sample, the addition of the intercept will always help predictions because regression minimizes the sum of squared prediction errors. Out of sample, however, the result can differ. Using a zero intercept will predict better than using an estimated intercept whenever the estimated intercept falls outside the range zero to one, because in those cases zero is closer to the true intercept than the estimated intercept is. This will occur approximately 60 percent of the time (the area under the normal distribution more than .5 standard errors from the mean). Thus, a model that fixes a small but nonzero coefficient could improve price predictions over models with zero coefficients, but this alternative requires an approach to forecasting beyond the scope of this paper.
18 The court may reject an expert witness opinion that is contrary to the judge’s assessment (see Kreis Estate v. Commissioner, 227 F.2d. 753, 755 (6th Cir. 1955)). The court may also reject expert testimony where the witness’s opinion of value is deemed exaggerated (see Chiu v. Commissioner, 84 T.C. 722 (1985)).
19 The IRS Restructuring and Reform Act of 1998 shifts the burden of proof to the IRS in certain limited circumstances where the taxpayer substantiates items with adequate records, cooperates with IRS information requests, and has net worth below certain limits.
20 We were also unable to obtain earnings and book value information from the underlying documentation on file with the U.S. Tax Court. For some cases, the plaintiff requests that the information be kept confidential. In other cases, the court destroys the underlying documentation and briefs after a very short period. Thus, we were forced to rely on the summary information in the judges’ opinions. Where the amount of the discount for lack of marketability was not reported but a discount was implied by the discussion, we used the averages from other cases.
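The "approximately 60 percent" figure in endnote 17 can be checked directly. The sketch below assumes, as the endnote does, that the estimated intercept is normally distributed with mean .5 and standard error 1; the variable names are introduced here for illustration.

```python
from statistics import NormalDist

# Endnote 17 setup: true intercept .5, estimated intercept ~ Normal(.5, 1).
true_intercept = 0.5
standard_error = 1.0
estimate = NormalDist(mu=true_intercept, sigma=standard_error)

# A fixed zero intercept beats the estimate whenever the estimate is
# farther than .5 from the true value, i.e. outside the range (0, 1).
p_zero_is_better = estimate.cdf(0.0) + (1.0 - estimate.cdf(1.0))

print(round(p_zero_is_better, 3))
```

The result is a little under 62 percent, in line with the endnote's approximation.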
21 The mean private firm valuations per share are significantly higher than the price per share of a typical public equity security. The motivations for differences in the number of shares issued and outstanding in private and public firms remain an open question. However, the differences in price level per share do not appear to limit the use of the method of comparables in estate and gift tax valuation cases.
22 The skewed distributions are not driven by a few outlier observations. For example, there are eight cases where the plaintiff’s estimated values are greater than $500 per share, eleven where the IRS values are greater than $500, and seven where the judges’ values are greater than $500.
23 These results also reflect that only cases where there is a significant difference in opinion about valuation will be litigated and then included in our sample.
24 Similar results were documented for the unreported net valuations.
25 Many court cases do not report dividends and assets so we cannot report results for models containing these variables.
26 The gross values before discounts are used because the method of comparables generates values assuming the firm is public. The discounts for lack of marketability then adjust the public firm value down to a net private firm value.
27 At the time of death, only the previous year’s accounting information would be available for estimating values.
28 Sensitivity analysis showed that the table 8 results did not differ significantly when removing the cases related to gifts rather than estates or when removing the cases tried in a District Court or Claims Court rather than the U.S. Tax Court. Further, the results did not differ when partitioning the sample by the judge’s years of experience or when restricting the analysis to those cases tried after 1980. Finally, the results did not change significantly when removing the four cases that did not indicate that the method of comparables was used by the judge or experts in the case.
REFERENCES

Alford, A. 1992. The effect of the set of comparable firms on the accuracy of price-earnings valuation method. Journal of Accounting Research: 94-108.
Boatsman, J. and E. Baskin. 1981. Asset valuation with incomplete markets. The Accounting Review: 38-53.
Cheng, A. and R. McNamara. 1995. The valuation of the price-earnings and price-book benchmark valuation methods. Working paper, University of Houston.
Cornell, B. 1993. Corporate Valuation--Tools for Effective Appraisal and Decision Making. Homewood, Illinois: Business One Irwin.
Commerce Clearing House. 1975-1993. Tax Court Memorandum Decisions.
Commerce Clearing House. 1998. IRS Valuation Training for Appeals Officers: A Coursebook. Chicago, IL: CCH, Inc.
Drew, C. and D. Johnston. 1996. For wealthy Americans, death is more certain than taxes. The New York Times (December 22): 14-15.
Englebrecht, T. and D. Davison. 1977. A statistical look at tax court compromise in estate and gift tax valuation of closely held stock. Taxes (June): 395-400.
Englebrecht, T. 1979. A reply, analysis, and extension of a closer statistical look at tax court compromise. Taxes (September): 607-614.
Englebrecht, T. and R. Jamison. 1979. An empirical inquiry into the role of the tax court in the valuation of property for charitable contribution purposes. The Accounting Review 54 (July): 554-562.
Hickman, K. and G. Petry. 1990. A comparison of stock price predictions using court accepted formulas, dividend discount, and P/e models. Financial Management: 416-427.
Howitt, I. 1998. Federal Tax Valuation Digest. 1996/1997 cumulative edition. Boston, MA: Warren, Gorham & Lamont.
LeClair, M. 1990. Valuing the closely-held corporation: The validity and performance of established valuation procedures. Accounting Horizons: 31-42.
Lev, B. and S. Sunder. 1979. Methodological issues in the use of financial ratios. Journal of Accounting and Economics: 187-210.
Miller, M. and F. Modigliani. 1961.
Dividend policy, growth and the valuation of shares. Journal of Business: 411-433.
Round, J. and D. Erickson. 1996. Family limited partnerships: The sum of the parts is not equal to the value of the whole. Published proceedings, Strasburger and Price L.L.P. Annual Tax Symposium, Dallas, TX.

FIGURE 1
Model summary

Definitions: e = earnings per share; b = book value per share; P = price per share; n = number of comparable firms in SIC code/year; d = dividends per share; a = total assets; i = target firm; j = comparable firms.

Proportional:

Equal Weight Average P/e P/b
$$\hat{P}_i = .5\left[\frac{1}{n}\sum_{j=1,\; j\neq i}^{n}\frac{P_j}{e_j}\right] e_i + .5\left[\frac{1}{n}\sum_{j=1,\; j\neq i}^{n}\frac{P_j}{b_j}\right] b_i$$

Equal Weight Inverse Average e/P b/P
$$\hat{P}_i = .5\left[1\Big/\frac{1}{n}\sum_{j=1,\; j\neq i}^{n}\frac{e_j}{P_j}\right] e_i + .5\left[1\Big/\frac{1}{n}\sum_{j=1,\; j\neq i}^{n}\frac{b_j}{P_j}\right] b_i$$

Equal Weight Median P/e P/b
$$\hat{P}_i = .5\,\mathrm{Median}\!\left(\frac{P_j}{e_j}\right) e_i + .5\,\mathrm{Median}\!\left(\frac{P_j}{b_j}\right) b_i$$

Equal Weight Ratio of Averages
$$\hat{P}_i = .5\,\frac{\sum_{j=1,\; j\neq i}^{n} P_j}{\sum_{j=1,\; j\neq i}^{n} e_j}\, e_i + .5\,\frac{\sum_{j=1,\; j\neq i}^{n} P_j}{\sum_{j=1,\; j\neq i}^{n} b_j}\, b_i$$

Deflated Regression Weight Earnings and Book
$$1 = \beta_1\frac{b_i}{P_i} + \beta_2\frac{e_i}{P_i} + \varepsilon_i, \qquad \omega_b = \frac{\beta_1}{1/\mathrm{Avg}(b/P)}, \qquad \omega_e = \frac{\beta_2}{1/\mathrm{Avg}(e/P)}$$

The coefficients are estimated with regression without an intercept with the independent and dependent variables deflated by price. The $\omega_b$ and $\omega_e$ weights indicate the relative importance of book value versus earnings information, respectively.

Linear:

Deflated Linear Earnings and Book
$$1 = \beta_0\frac{1}{P_i} + \beta_1\frac{b_i}{P_i} + \beta_2\frac{e_i}{P_i} + \varepsilon_i$$

Deflated Linear Earnings, Book, and Dividends
$$1 = \beta_0\frac{1}{P_i} + \beta_1\frac{b_i}{P_i} + \beta_2\frac{e_i}{P_i} + \beta_3\frac{d_i}{P_i} + \varepsilon_i$$

Deflated Linear Earnings, Book, Dividends, and Assets
$$1 = \beta_0\frac{1}{P_i} + \beta_1\frac{b_i}{P_i} + \beta_2\frac{e_i}{P_i} + \beta_3\frac{d_i}{P_i} + \beta_4\frac{a_i}{P_i} + \varepsilon_i$$

The coefficients are estimated using regression analysis with an intercept with the independent and dependent variables deflated by price.
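The four equal-weight proportional estimators in figure 1 can be sketched in code as follows. This is a minimal illustration rather than the authors' implementation: the function name `proportional_forecasts`, the dictionary keys, and the four-firm data set are hypothetical, and the averages are taken over the comparable firms j ≠ i.

```python
from statistics import mean, median


def proportional_forecasts(prices, earnings, books, i):
    """Equal-weight proportional price forecasts for target firm i,
    using every other firm (j != i) as a comparable."""
    comps = [j for j in range(len(prices)) if j != i]
    p = [prices[j] for j in comps]
    e = [earnings[j] for j in comps]
    b = [books[j] for j in comps]
    e_i, b_i = earnings[i], books[i]

    pe = [pj / ej for pj, ej in zip(p, e)]  # P/e multiples of comparables
    pb = [pj / bj for pj, bj in zip(p, b)]  # P/b multiples of comparables
    ep = [ej / pj for pj, ej in zip(p, e)]  # e/P yields of comparables
    bp = [bj / pj for pj, bj in zip(p, b)]  # b/P yields of comparables

    return {
        "average": 0.5 * mean(pe) * e_i + 0.5 * mean(pb) * b_i,
        "inverse_average": 0.5 * e_i / mean(ep) + 0.5 * b_i / mean(bp),
        "median": 0.5 * median(pe) * e_i + 0.5 * median(pb) * b_i,
        "ratio_of_averages": 0.5 * sum(p) / sum(e) * e_i
                             + 0.5 * sum(p) / sum(b) * b_i,
    }


# Hypothetical per-share data for four firms in one SIC code/year.
prices = [40.0, 30.0, 20.0, 50.0]
earnings = [3.0, 1.0, 2.0, 4.0]
books = [20.0, 25.0, 10.0, 30.0]

forecasts = proportional_forecasts(prices, earnings, books, i=0)
for name, value in forecasts.items():
    print(f"{name}: {value:.2f}")
```

Because an arithmetic mean of positive ratios is never smaller than the inverse of the mean of their reciprocals, the "average" forecast is always at least as large as the "inverse_average" forecast, which is one way to see the upward bias discussed in endnote 6.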
TABLE 1
Sample Selection for Public Firms

Criteria                                                                          Number of firm/years
Complete earnings, book value, price, and number of shares from Compustat data    73,434
At least 20 firms in 3-digit SIC code/year                                        47,601
Positive earnings and book value                                                  28,813
Earnings, book and price below upper screens                                      28,318
Number of 3-digit SIC code/years analyzed                                         598
Average number of firms per SIC code                                              47

Compustat data for 1980 through 1992 are included in the analysis. Firm/years included must have complete price, earnings, book, and number of shares data. There must be at least 20 firms in the 3-digit SIC code for that year. Firm/years with negative earnings and/or book value are excluded. Firm/years with earnings above $9 per share or book value above $66 per share are also removed, which eliminates approximately the upper 1 percent of the earnings or book distributions.

TABLE 2 PANEL A
Descriptive Statistics for Deflated Variables

Intercept /Price Mean Variance .83 1.58 Earnings /Price .58 1.07 Dividends /Price .19 1.58 Book/Price .46 1.58 Assets/Price 3.08 152.87

TABLE 2 PANEL B
Correlation Matrix for Deflated Variables

Intercept /Price Earnings/Price Book/Price Dividends/Price Assets/Price .39 .36 -.06 .13 Earnings /Price Dividends /Price Book/Price .22 .02 .08 -.11 .02 .08

We use earnings before extraordinary items and book value of total stockholders’ equity. We define dividends as current earnings minus the change in book value. This measure is essentially all non-earnings changes in capital where stock issuances are viewed as negative dividends. However, if lagged book value is not available, we use actual dividends paid to avoid eliminating these firms from our sample. The statistics are based on the variables deflated by price because we use deflated variables for our regression models.
TABLE 3 PANEL A
Pricing Coefficients and Percentage Pricing Error Statistics for Per Share Data of Public Firms (n=28,318)

                                        Coefficients and t-statistics                                     Percentage pricing error
Model                                   Intercept   Earnings     Book         Dividends   Assets          Mean     RMSE
Proportional
Equal weight average P/e P/b                        13.41        1.21                                     -1.18    4.41
Equal weight inverse average e/P b/P                6.10         .68                                      -.02     .84
Equal weight median P/e P/b                         7.01         .82                                      -.17     1.05
Equal weight ratio of averages                      6.75         .83                                      -.18     1.07
Deflated regression weight e b                      4.88         .53                                      .19      .70
  Mean t-stats                                      (3.79***)    (3.79***)
Linear
Deflated linear e b                     .59         5.09         .50                                      .15      .94
  Mean t-stats                          (.12*)      (3.94***)    (3.44***)
Deflated linear e b d                   .60         4.97         .51          .30                         .12      1.12
  Mean t-stats                          (.21*)      (3.72***)    (3.40***)    (.68**)
Deflated linear e b d a                 .66         4.80         .46          .25         .01             .10      1.23
  Mean t-stats                          (.29*)      (3.58***)    (3.09***)    (.58**)     (1.06**)

The models are described in figure 1. P represents price, e represents earnings, b represents book, d represents dividends and a represents assets. The coefficients represent the amounts multiplied by the respective variables to estimate price. The reported coefficients are the global average after estimating the models for each of the 598 SIC code/years used in the analysis. The mean t-statistics across SIC code/years are reported in parentheses. *, **, *** indicate that the mean t is 2, 10, or 30 standard deviations from zero, respectively. The pricing error statistics are calculated for actual price minus predicted price deflated by actual price for the various models using per share data and out-of-sample estimation. The global mean deflated prediction error after estimating each model by SIC and year is reported. The RMSE represents the square root of the global mean of the squared residuals.
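The two pricing error statistics defined in the note to table 3 panel A can be written out as a short calculation. The sketch below follows the definition given there (actual price minus predicted price, deflated by actual price); the function name `pricing_error_stats` and the three price pairs are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def pricing_error_stats(actual, predicted):
    """Mean and RMSE of (actual - predicted) / actual."""
    errors = [(a - p) / a for a, p in zip(actual, predicted)]
    mean_error = sum(errors) / len(errors)
    rmse = math.sqrt(sum(err ** 2 for err in errors) / len(errors))
    return mean_error, rmse


# Hypothetical actual and out-of-sample predicted prices per share.
actual_prices = [10.0, 20.0, 40.0]
predicted_prices = [12.0, 18.0, 50.0]

mean_error, rmse = pricing_error_stats(actual_prices, predicted_prices)
print(round(mean_error, 3), round(rmse, 3))
```

Under this sign convention a negative mean error indicates that the model overpredicts price on average, as with the mean of -1.18 reported for the equal weight average P/e P/b model.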
TABLE 3 PANEL B
Percentage Pricing Error Pair-Wise Central Limit Z-Statistic for Per Share Data of Public Firms (n=28,318)

Column abbreviations: Avg = equal weight average P/e P/b; Inv = equal weight inverse average e/P b/P; Med = equal weight median P/e P/b; RoA = equal weight ratio of averages; DRW = deflated regression weight e b; DL eb = deflated linear e b; DL ebd = deflated linear e b d.

                                        Avg      Inv      Med      RoA      DRW      DL eb    DL ebd
Equal weight inverse average e/P b/P    -5.97
Equal weight median P/e P/b             -5.86    7.13
Equal weight ratio of averages          -5.85    6.22     1.40
Deflated regression weight e b          -6.02    -5.35    -6.73    -6.17
Deflated linear e b                     -5.88    .80      -.96     -1.12    1.87
Deflated linear e b d                   -5.75    1.86     .54      .39      2.62     1.96
Deflated linear e b d a                 -5.64    1.85     .95      .85      2.37     1.82     1.53

A central limit z-statistic is used to make pair-wise comparisons based on the magnitude of the squared deflated prediction errors. The z-statistic represents the mean difference in the squared deflated prediction errors across two models divided by the standard error of the mean. A positive (negative) statistic indicates that the estimator in the column (row) provides more accurate price predictions.

TABLE 4
Private Firm Court Case Sample Year of Death and Industry Representation (n=31)

Year of death or gift   Case identifier                                                                Court      SIC code
1964                    Estate of Edward A. Tully... v. U.S. (1978)                                    Claims     016
1965                    Edward J. Fehrs... v. U. S. (1979)                                             Claims     508
1967                    Estate of Oakley J. Hall...v. Commissioner (1975)                              Tax        373
1968                    Fred. A. Berzon v. Commissioner (1975)                                         Tax        513
1969                    James O. Driver... Estate of Margaret K. Burgi v. U.S. (1976)                  District   481
1969                    Estate of Victor P. Clarke... v. Commissioner (1976)                           Tax        381
1970                    Estate of Harry G. Stoddard v. Commissioner (1975)                             Tax        271
1971                    Estate of John L. Huntsman... v. Commissioner (1976)                           Tax        154
1972                    Fredrik G. H. Meijer... v. Commissioner (1979)                                 Tax        541
1973                    Estate of Arthur F. Little...v. Commissioner (1982)                            Tax        341
1974                    Rudolph M. Maris v. Commissioner (1980)                                        Tax        518
1974                    Estate of Ethyl L. Goodrich... v. Commissioner (1978)                          Tax        271
1976                    Albert L. Luce... v. U.S. (1984)                                               Claims     371
1976                    The Northern Trust Co....v. Commissioner (1986)                                Tax        161
1976                    Estate of Howard W. Cook v. U.S. (1986)                                        District   602
1977                    Estate of Stirton Oman v. Commissioner (1987)                                  Tax        016
1979                    Estate of Charles B. Gillet v. Commissioner (1985)                             Tax        494
1979                    Estate of Saul R. Gilford... v. Commissioner (1987)                            Tax        384
1980                    Estate of Joseph Giselman... v. Commissioner (1988)                            Tax        175
1980                    The First National Bank...Estate of Captilles A. Lick, Jr. v. U.S. (1985)      District   275
1981                    Estate of Clara S. Roeder Winkler... v. Commissioner (1989)                    Tax        291
1982                    Thomas E. Reilly... v. U.S (1988)                                              District   289
1982                    Estate of Edwin Wallace Neff... v. Commissioner (1989)                         Tax        274
1982                    Estate of Milton Feldmar... v. Commissioner (1988)                             Tax        671
1982                    Estate of Elizabeth B. Murphy... v. Commissioner (1990)                        Tax        271
1982                    Estate of Catherine Campbell... v. Commissioner (1991)                         Tax        10
1983                    Estate of Joseph H. Lauder... v. Commissioner (1992)                           Tax        284
1984                    Estate of Mildred Herschede Jung... v. Commissioner (1993)                     Tax        504
1985                    Estate of Charles Russell Bennett... v. Commissioner (1993)                    Tax        655
1985                    Estate of Margaret A. Jann... v. Commissioner (1990)                           Tax        349
1986                    Estate of Bessie I. Mueller... v. Commissioner (1992)                          Tax        349

A sample of 31 court cases using private firm valuations for estate and gift taxes was collected from Commerce Clearing House’s Tax Court Memorandum Decisions (1975-1993) and Federal Tax Valuation Digest (1998) compiled by Idelle A. Howitt. The cases were required to have a date of death after 1960 and complete positive earnings, book value and number of shares data. The year in parentheses is the year of the opinion. The court column indicates whether the case was tried in the U.S. Tax Court, Claims Court or a District Court.
TABLE 5
Percentage of 31 Sample Cases Using Specific Variables, Ratios, or Valuation Methods

                              Percentage
Variables
  Earnings                    100%
  Book                         90
  Dividends                    68
Ratio
  P/E Ratio                    84
  P/B Ratio                    48
Valuation Method
  Comparables                  87
  Adjusted Book Value          52
  Discounted Cash Flow         32

The percentages represent the number of cases out of the 31 in our sample where the judge's opinion mentioned that a specific variable, ratio, or method was used by the judge, taxpayers, or IRS to estimate the value of the private firm.

TABLE 6
Comparisons of Private Firms Tax Court Case Sample Valuations for Taxpayer Expert, Judge, and IRS Expert (n=31)

                                                    Mean differences and (t-test) [Wilcoxon] tests of difference
                    Taxpayer   Judge    IRS         Judge - Taxpayer             IRS - Judge                  IRS - Taxpayer
                    expert              expert
Gross valuation before subtracting marketability discount
  Mean              567        950      1333        383 (2.02**) [3.53***]       383 (1.70*) [3.94***]        766 (2.27**) [4.55***]
  Median            62         125      147
  Std. deviation    967        1777     2716
Marketability discount
  Mean              .32        .25      .21         -.07 (-2.53**) [-2.70***]    -.04 (-3.93***) [-3.60***]   -.11 (-4.28***) [-4.08***]
  Median            .32        .26      .22
  Std. deviation    .16        .07      .07
Net valuation after subtracting marketability discount
  Mean              386        676      1046        290 (2.24**) [4.78***]       370 (2.06**) [4.70***]       660 (2.48**) [4.86***]
  Median            40         88       110
  Std. deviation    695        1243     2109

All dollar amounts are recorded as per share amounts. The numbers in parentheses are t-statistics from tests of differences in means using dependent sample t-tests. The numbers in brackets are z-statistics from nonparametric Wilcoxon matched-pairs signed ranks tests. *, **, and *** represent significance at the .10, .05, and .01 levels, respectively, using two-tailed tests.

TABLE 7
Comparison of Judge and Expert Average Valuation (n=31)

                    Judge      Expert     Difference judge     T-test (Wilcoxon)
                               average    - expert average     test of difference
Panel A: Gross valuation before subtracting marketability discount (n=31)
  Mean              950        950        .15                  .00 (.32)
  Median            125        104
  Std. deviation    1777       1809
Panel B: 1980 or before gross valuation before subtracting marketability discount (n=11)
  Mean              1308       1442       -134                 -.40 (.45)
  Median            294        322
  Std. deviation    2359       2602
After 1980 gross valuation before subtracting marketability discount (n=20)
  Mean              753        679        74                   1.26 (.71)
  Median            91         70
  Std. deviation    1395       1177

All dollar amounts are recorded as per share amounts. t-statistics represent tests of differences in means using dependent sample t-tests. z-statistics in parentheses are from nonparametric Wilcoxon matched-pairs signed ranks tests. *, **, and *** represent significance at the .10, .05, and .01 levels, respectively, using two-tailed tests. Panel A reports results for all cases. Panel B separates the cases by whether they were decided after 1980.

TABLE 8 PANEL A
Comparison of Valuation Models with Taxpayer Expert, Judge, and IRS Expert for Private Firm Court Cases without Bias Adjustment (n=31)

                                      Mean         Taxpayer vs. model              Judge vs. model                 IRS vs. model
Model                                 (median)     Mean      T-test                Mean      T-test                Mean      T-test
                                      model        diff.     (Wilcoxon)            diff.     (Wilcoxon)            diff.     (Wilcoxon)
Equal weight average P/e P/b          2.27 (1.73)  -1.52     -4.33*** (-4.64***)   -1.27     -3.53*** (-3.57***)   -.84      -2.69** (-2.33**)
Equal weight inverse average e/P b/P  1.20 (1.14)  -.45      -3.30*** (-3.51***)   -.20      -1.39 (-.99)          .23       1.81* (2.10**)
Equal weight median P/e P/b           1.36 (1.29)  -.62      -4.23*** (-4.10***)   -.36      -2.38** (-2.16**)     .07       .50 (1.06)
Equal weight ratio of averages        1.40 (1.26)  -.65      -4.69*** (-4.21***)   -.40      -2.68** (-2.27**)     .03       .23 (.65)
Deflated regression weight eb         .93 (.79)    -.19      -1.82* (-1.16)        .06       .62 (1.78*)           .50       4.73*** (3.76***)
Deflated linear eb                    .90 (.80)    -.15      -1.55 (-1.12)         .10       1.01 (1.55)           .53       5.31*** (4.08***)

Model valuations are determined by using the estimated coefficients for a sample of comparable public firms and applying them to the earnings and book value reported for the private firm in the court case. t-statistics represent dependent sample t-tests of mean differences between the adjusted model and the expert or judge predictions. z-statistics (in parentheses) represent nonparametric tests of difference using Wilcoxon matched-pairs signed ranks tests. *, **, and *** represent significance at the .10, .05, and .01 levels, respectively, using two-tailed tests.

TABLE 8 PANEL B
Comparison of Valuation Models with Taxpayer Expert, Judge, and IRS Expert for Private Firm Court Cases with Bias Adjustment (n=31)

                                      Mean         Taxpayer vs. model              Judge vs. model                 IRS vs. model
Model                                 (median)     Mean      T-test                Mean      T-test                Mean      T-test
                                      model        diff.     (Wilcoxon)            diff.     (Wilcoxon)            diff.     (Wilcoxon)
Equal weight average P/e P/b          1.09 (.55)   -.34      -.98 (.10)            -.09      -.25 (.94)            .34       1.10 (2.21**)
Equal weight inverse average e/P b/P  1.18 (1.12)  -.43      -3.15*** (-3.29***)   -.18      -1.25 (-.88)          .25       1.97* (2.20**)
Equal weight median P/e P/b           1.19 (1.12)  -.45      -3.07*** (-2.84***)   -.19      -1.27 (-.80)          .24       1.75* (1.82*)
Equal weight ratio of averages        1.22 (1.08)  -.47      -3.40*** (-3.04***)   -.22      -1.48 (-1.06)         .03       .23 (1.63*)
Deflated regression weight eb         1.12 (.98)   -.38      -3.67*** (-3.82***)   -.12      -1.18 (-.31)          .31       2.92*** (3.18***)
Deflated linear eb                    1.05 (.95)   -.30      -3.07*** (-2.88***)   -.05      -.53 (-.06)           .38       3.81*** (3.57***)

Model valuations are determined by using the estimated coefficients for a sample of comparable public firms and applying them to the earnings and book value reported for the private firm in the court case. The model estimates are adjusted for the mean percentage pricing error reported in Table 3 and then deflated by the judge's price. t-statistics represent dependent sample t-tests of mean differences between the adjusted model and the expert or judge predictions. z-statistics (in parentheses) represent nonparametric tests of difference using Wilcoxon matched-pairs signed ranks tests. *, **, and *** represent significance at the .10, .05, and .01 levels, respectively, using two-tailed tests.
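The computation described in the notes to Table 8 Panel B can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the percentage pricing error is defined as (model value - price) / price, so that dividing a model estimate by one plus the mean error removes the average bias; the paper's exact adjustment may differ. The function names and all per-share figures below are hypothetical and do not come from the sample.

```python
import math
import statistics


def bias_adjusted_deflated(model_value, mean_pct_error, judge_value):
    """Bias-adjust a model estimate, then deflate it by the judge's value.

    Assumption (not confirmed by the paper): pricing error is measured as
    (model - price) / price, so dividing by (1 + mean error) removes it.
    """
    adjusted = model_value / (1.0 + mean_pct_error)
    return adjusted / judge_value


def paired_t_statistic(x, y):
    """Dependent-sample t-statistic for the mean of the differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Hypothetical per-share values for four cases (illustration only).
model_values = [120.0, 45.0, 300.0, 80.0]
judge_values = [100.0, 50.0, 250.0, 64.0]
taxpayer_values = [70.0, 40.0, 200.0, 48.0]
mean_pct_error = 0.25  # hypothetical mean percentage pricing error

# Express both the model and the taxpayer expert relative to the judge.
model_ratio = [
    bias_adjusted_deflated(m, mean_pct_error, j)
    for m, j in zip(model_values, judge_values)
]
taxpayer_ratio = [t / j for t, j in zip(taxpayer_values, judge_values)]

# Mean difference "taxpayer vs. model" and its dependent-sample t-statistic.
mean_diff = statistics.mean(t - m for t, m in zip(taxpayer_ratio, model_ratio))
t_stat = paired_t_statistic(taxpayer_ratio, model_ratio)
print(f"mean difference = {mean_diff:.4f}, t = {t_stat:.2f}")
```

Because every series is deflated by the judge's value, the judge's own ratio is identically one, which is why the judge-versus-model mean difference in each row of Table 8 is approximately one minus the model mean.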