# Errors on Percentage Errors


The MAPE (mean absolute percentage error) is a popular measure for forecast accuracy and is defined as

$$\text{MAPE} = 100\,\text{mean}\left(\frac{|y_t - \hat{y}_t|}{|y_t|}\right),$$

where $y_t$ denotes an observation and $\hat{y}_t$ denotes its forecast, and the mean is taken over $t$.
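As a minimal sketch, assuming the standard definition of the MAPE as 100 times the mean of $|y_t - \hat{y}_t| / |y_t|$, the calculation could be written in Python as follows. The function name `mape` and the sample numbers are illustrative only.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error: 100 * mean(|y - yhat| / |y|)."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    # Each term is undefined (division by zero) when an observation is zero.
    ratios = [abs(y - f) / abs(y) for y, f in zip(actuals, forecasts)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical data: absolute errors of 10, 20 and 0
# on observations of 100, 200 and 400.
print(mape([100, 200, 400], [110, 180, 400]))
```

Note that the measure breaks down as soon as any observation is zero, which is one reason it only makes sense for some kinds of data.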

Armstrong (1985, p.348) was the first (to my knowledge) to point out the asymmetry of the MAPE saying that “it has a bias favoring estimates that are below the actual values”. A few years later, Armstrong and Collopy (1992) argued that the MAPE “puts a heavier penalty on forecasts that exceed the actual than those that are less than the actual”. Makridakis (1993) took up the argument saying that “equal errors above the actual value result in a greater APE than those below the actual value”. He provided an example where $y_t=150$ and $\hat{y}_t=100$, so that the relative error is 50÷150=0.33, in contrast to the situation where $y_t=100$ and $\hat{y}_t=150$, when the relative error would be 50÷100=0.50.

Thus, the MAPE puts a heavier penalty on negative errors (when $y_t < \hat{y}_t$) than on positive errors. This is what is stated in my textbook. Unfortunately, Anne Koehler and I got it the wrong way around in our 2006 paper on measures of forecast accuracy, where we said the heavier penalty was on positive errors. We were probably thinking that a forecast that is too large is a positive error. However, forecast errors are defined as $y_t - \hat{y}_t$, so positive errors arise only when the forecast is too small.
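The asymmetry can be checked numerically. The sketch below takes an actual of 150 with a forecast of 100, and then the reverse, which are the values consistent with the 50÷150 and 50÷100 ratios quoted above. The helper name `ape` is an assumption made for illustration.

```python
def ape(y, yhat):
    """Absolute percentage error of one forecast, as a fraction of the actual."""
    return abs(y - yhat) / abs(y)


# Forecast too low: y = 150, yhat = 100, so the error y - yhat is +50.
under = ape(150, 100)
# Forecast too high: y = 100, yhat = 150, so the error y - yhat is -50.
over = ape(100, 150)

print(round(under, 2), round(over, 2))  # 0.33 0.5
```

The absolute error is 50 in both cases, but the over-forecast (a negative error) is penalized more heavily because it is divided by the smaller actual value.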

To avoid the asymmetry of the MAPE, Armstrong (1985, p.348) proposed the “adjusted MAPE”, which he defined as

$$\text{adjusted MAPE} = 100\,\text{mean}\left(\frac{2|y_t - \hat{y}_t|}{y_t + \hat{y}_t}\right).$$

By that definition, the adjusted MAPE can be negative (if $y_t+\hat{y}_t < 0$), or infinite (if $y_t+\hat{y}_t = 0$), although Armstrong claims that it has a range of (0,200). Presumably he never imagined that data and forecasts can take negative values. Strangely, there is no reference to this measure in Armstrong and Collopy (1992).
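A short Python sketch makes the range problem concrete. It assumes that a single term of the adjusted MAPE is $200\,|y_t - \hat{y}_t| / (y_t + \hat{y}_t)$, with no absolute values in the denominator. The function name `adjusted_ape` and the inputs are hypothetical.

```python
def adjusted_ape(y, yhat):
    """One term of the adjusted MAPE: 200 * |y - yhat| / (y + yhat)."""
    return 200 * abs(y - yhat) / (y + yhat)


print(adjusted_ape(100, 150))   # both positive: 40.0
print(adjusted_ape(1, -3))      # y + yhat < 0: -400.0

try:
    adjusted_ape(5, -5)         # y + yhat == 0
except ZeroDivisionError:
    # Python raises here; mathematically the term is infinite or undefined.
    print("undefined: y + yhat is zero")
```

With positive data and positive forecasts the value stays inside (0,200), but a single negative forecast is enough to push it outside that interval.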

Makridakis (1993) proposed almost the same measure, calling it the “symmetric MAPE” (sMAPE), but without crediting Armstrong (1985), defining it as

$$\text{sMAPE} = 100\,\text{mean}\left(\frac{2|y_t - \hat{y}_t|}{|y_t + \hat{y}_t|}\right).$$

However, in the M3 competition paper by Makridakis and Hibon (2000), sMAPE is defined equivalently to Armstrong’s adjusted MAPE (without the absolute values in the denominator), again without reference to Armstrong (1985). Makridakis and Hibon claim that this version of sMAPE has a range of (-200,200).

Flores (1986) proposed a modified version of Armstrong’s measure, defined as exactly half of the adjusted MAPE defined above. He claimed (again incorrectly) that it had an upper bound of 100.

Of course, the true range of the adjusted MAPE is $(-\infty,\infty)$, as is easily seen by considering the two cases $y_t+\hat{y}_t = \varepsilon$ and $y_t+\hat{y}_t = -\varepsilon$, where $\varepsilon > 0$, and letting $\varepsilon \rightarrow 0$. Similarly, the true range of the sMAPE defined by Makridakis (1993) is $(0,\infty)$. I’m not sure that these errors have previously been documented, although they have surely been noticed.
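The limiting argument can be illustrated numerically. The sketch below fixes the observation at 1 and picks forecasts so that the sum of observation and forecast is a small positive or negative number, then shrinks that number. The function names are assumptions: `adjusted_ape` omits the absolute value in the denominator and `makridakis_sape` includes it.

```python
def adjusted_ape(y, yhat):
    """Term without absolute values in the denominator."""
    return 200 * abs(y - yhat) / (y + yhat)


def makridakis_sape(y, yhat):
    """Term with |y + yhat| in the denominator."""
    return 200 * abs(y - yhat) / abs(y + yhat)


# Fix y = 1 and choose yhat so that y + yhat is close to +eps or -eps.
for eps in (1e-1, 1e-3, 1e-6):
    pos = adjusted_ape(1.0, -1.0 + eps)      # denominator near +eps
    neg = adjusted_ape(1.0, -1.0 - eps)      # denominator near -eps
    sym = makridakis_sape(1.0, -1.0 - eps)   # always positive
    print(eps, pos, neg, sym)
```

As the denominator shrinks, the first measure grows without bound in both directions and the second grows without bound in the positive direction, so neither has a finite range.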

Goodwin and Lawton (1999) point out that on a percentage scale, the MAPE is symmetric and the sMAPE is asymmetric. For example, if $y_t = 100$, then $\hat{y}_t = 110$ gives a 10% error, as does $\hat{y}_t = 90$. Either would contribute the same increment to MAPE, but a different increment to sMAPE.
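To see the size of the difference, here is a small sketch with an actual of 100 and forecasts of 110 and 90. It assumes the per-observation terms $100\,|y_t - \hat{y}_t| / |y_t|$ for the MAPE and $200\,|y_t - \hat{y}_t| / |y_t + \hat{y}_t|$ for the sMAPE. The names `ape` and `sape` are illustrative.

```python
def ape(y, yhat):
    """One MAPE term, in percent."""
    return 100 * abs(y - yhat) / abs(y)


def sape(y, yhat):
    """One sMAPE term (Makridakis 1993 form), in percent."""
    return 200 * abs(y - yhat) / abs(y + yhat)


y = 100
for yhat in (110, 90):
    print(yhat, ape(y, yhat), round(sape(y, yhat), 2))
# 110 10.0 9.52
# 90 10.0 10.53
```

Both forecasts contribute 10 to the MAPE, while the sMAPE penalizes the forecast that is too low slightly more than the one that is too high.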

Anne Koehler (2001), in a commentary on the M3 competition, made the same point, but without reference to Goodwin and Lawton.

Whether symmetry matters or not, and whether we want to work on a percentage or absolute scale, depends entirely on the problem, so these discussions over (a)symmetry don’t seem particularly useful to me.

Chen and Yang (2004), in an unpublished working paper, defined the sMAPE as

$$\text{sMAPE} = \text{mean}\left(\frac{2|y_t - \hat{y}_t|}{|y_t| + |\hat{y}_t|}\right).$$

They still called it a measure of “percentage error” even though they dropped the multiplier 100. At least they got the range correct, stating that this measure has a maximum value of two when either $y_t$ or $\hat{y}_t$ is zero, but is undefined when both are zero. The range of this version of sMAPE is (0,2). Perhaps this is the definition that Makridakis and Armstrong intended all along, although neither has ever managed to include it correctly in one of their papers or books.
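A minimal sketch of this version, assuming each term is $2\,|y_t - \hat{y}_t| / (|y_t| + |\hat{y}_t|)$ with no factor of 100, shows the upper bound of two. The function name `chen_yang_smape` and the choice to raise an error when both values are zero are assumptions made here, not part of the working paper.

```python
def chen_yang_smape(actuals, forecasts):
    """mean(2 * |y - yhat| / (|y| + |yhat|)), with no factor of 100."""
    terms = []
    for y, f in zip(actuals, forecasts):
        denom = abs(y) + abs(f)
        if denom == 0:
            # Undefined when the observation and the forecast are both zero.
            raise ValueError("term undefined when both values are zero")
        terms.append(2 * abs(y - f) / denom)
    return sum(terms) / len(terms)


print(chen_yang_smape([0], [7]))    # 2.0: observation is zero
print(chen_yang_smape([7], [0]))    # 2.0: forecast is zero
print(chen_yang_smape([5], [-5]))   # 2.0: opposite signs
```

The maximum is also reached whenever the observation and forecast have opposite signs, because then $|y_t - \hat{y}_t| = |y_t| + |\hat{y}_t|$.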

As will be clear by now, the literature on this topic is littered with errors. The Wikipedia page on sMAPE contains several as well, which a reader might like to correct.

If all data and forecasts are non-negative, then the same values are obtained from all three definitions of sMAPE. But more generally, the last definition above from Chen and Yang is clearly the most sensible, if the sMAPE is to be used at all. In the M3 competition, all data were positive, but some forecasts were negative, so the differences are important. However, I can’t match the published results for any definition of sMAPE, so I’m not sure how the calculations were actually done.
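The following sketch compares the three definitions on one non-negative pair and two pairs with a negative forecast. The function names are hypothetical, the data are made up, and the Chen and Yang term is multiplied by 100 here purely so that all three are on the same scale.

```python
def armstrong(y, f):
    """Adjusted MAPE term (also the Makridakis and Hibon 2000 form)."""
    return 200 * abs(y - f) / (y + f)


def makridakis(y, f):
    """Makridakis (1993) term."""
    return 200 * abs(y - f) / abs(y + f)


def chen_yang(y, f):
    """Chen and Yang (2004) term, rescaled by 100 for comparison only."""
    return 100 * 2 * abs(y - f) / (abs(y) + abs(f))


# (observation, forecast) pairs: the last two have negative forecasts.
for y, f in [(100, 80), (10, -2), (10, -30)]:
    print(y, f, armstrong(y, f), makridakis(y, f), chen_yang(y, f))
```

For the first pair all three agree. For the pair (10, -2) the first two exceed 200, and for (10, -30) the first is negative, while the rescaled Chen and Yang term stays at its maximum of 200.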

Personally, I would much prefer that either the original MAPE be used (when it makes sense), or the mean absolute scaled error (MASE) be used instead. There seems little point using the sMAPE except that it makes it easy to compare the performance of a new forecasting algorithm against the published M3 results. But even there, it is not necessary, as the forecasts submitted to the M3 competition are all available in the Mcomp package for R, so a comparison can easily be made using whatever measure you prefer.
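For readers unfamiliar with the MASE, here is a rough sketch of its non-seasonal form, assuming the usual definition: the mean absolute forecast error divided by the mean absolute error of one-step naive forecasts on the training data. The function name `mase` and the numbers are hypothetical, and seasonal data would need a seasonal naive scaling instead.

```python
def mase(train, actuals, forecasts):
    """Mean absolute scaled error, scaled by in-sample naive forecast errors."""
    # Scale: mean absolute error of the naive forecast y[t-1] on the training data.
    naive_errors = [abs(train[i] - train[i - 1]) for i in range(1, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    abs_errors = [abs(y - f) for y, f in zip(actuals, forecasts)]
    return (sum(abs_errors) / len(abs_errors)) / scale


train = [10, 12, 11, 13]                  # hypothetical history
print(mase(train, [14, 15], [13, 13]))    # hypothetical test data and forecasts
```

Because the scaling uses absolute differences of the data rather than the data values themselves, the measure does not blow up when individual observations or forecasts are zero or negative.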

Thanks to Andrey Kostenko for alerting me to the different definitions of sMAPE in the literature.


Published at DZone with permission of

Opinions expressed by DZone contributors are their own.