Consider the basis for how MonsterMetrics does its job (recently clarified in the code). We want to measure monster potency over a wide range of possible PC levels. So for each fighter level 1-12, we compute the number of fighters that comes closest to an even fight against that monster, that is, closest to a 50% chance of either side winning. This itself entails a binary search across the possible numbers of fighters 1-64 (or the inverse, several monsters against one fighter), with possibly hundreds or thousands of simulated combats at each candidate number to assess the chance of victory on each side. This generates an array I call Equated Fighters (EF); each element therein is multiplied by its level index to generate a value called the Equated Fighters Hit Dice (EFHD).
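
The search step can be sketched roughly as follows. This is a minimal illustration, not the actual MonsterMetrics code: the class name, the method `closestToEvenFight`, and the `fighterWinRate` function (a stand-in for running the simulated combats at a given number of fighters) are all hypothetical, and the sketch assumes the fighters' win rate never decreases as fighters are added. It covers only the "many fighters versus one monster" direction.

```java
import java.util.function.IntToDoubleFunction;

public class EquatedFightersSketch {

    /**
     * Finds the number of fighters in [1, maxFighters] whose estimated
     * win rate is closest to 50%. Assumes (hypothetically) that the win
     * rate is non-decreasing in the number of fighters.
     */
    public static int closestToEvenFight(IntToDoubleFunction fighterWinRate, int maxFighters) {
        int lo = 1;
        int hi = maxFighters;
        // Binary search for the smallest count with win rate >= 50%
        // (ends at maxFighters if no count reaches 50%).
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (fighterWinRate.applyAsDouble(mid) < 0.5) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // The closest match is either that count or the one just below it.
        if (lo > 1) {
            double above = Math.abs(fighterWinRate.applyAsDouble(lo) - 0.5);
            double below = Math.abs(fighterWinRate.applyAsDouble(lo - 1) - 0.5);
            if (below < above) {
                return lo - 1;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Toy stand-in for the simulator: win rate rises linearly, hitting 50% at 3 fighters.
        IntToDoubleFunction toyWinRate = n -> Math.min(1.0, n / 6.0);
        System.out.println(closestToEvenFight(toyWinRate, 64)); // prints 3
    }
}
```

With a real simulator the win rate is itself a noisy estimate, so the result near the 50% crossover can wobble from run to run unless enough combats are simulated at each count.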

Example: Consider the standard Ogre. We find that one is fairly matched against 2 Veterans/Warriors, 1 Swordsman/Hero/Swashbuckler, ½ Myrmidon/Champion/Superhero/Lord (that is, 2 ogres against one Superhero is fair), and so forth. So we have an Equated Fighters array that looks like this: EF [2.0, 2.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.3, 0.3, 0.3] (the 0.3 entries being ⅓, rounded). Multiplying each value by its associated level gives us the Equated Fighters Hit Dice; spot-checking a few simple values, we see that an Ogre is equally worth 4 HD of Warriors (2 × 2HD each), or 4 HD of Heroes (1 × 4HD each), or 4 HD of Superheroes (½ × 8HD each), etc. Not to say that every point is exactly the same; due to the discrete nature of the matchups, there is a bit of a sawtooth artifact in the values. In total we have this array for the Ogre: EFHD [2.0, 4.0, 3.0, 4.0, 5.0, 3.0, 3.5, 4.0, 4.5, 3.3, 3.7, 4.0], and this is shown pictorially below.
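
The multiplication step is simple enough to show directly. This is only a sketch with hypothetical names (`EfhdSketch`, `toEfhd`), not the MonsterMetrics source. It also assumes that the last three Ogre entries are exactly ⅓, which is what the printed EFHD values of 3.3, 3.7, and 4.0 imply.

```java
import java.util.Locale;
import java.util.StringJoiner;

public class EfhdSketch {

    /** Multiplies each Equated Fighters entry by its 1-based fighter level. */
    public static double[] toEfhd(double[] ef) {
        double[] efhd = new double[ef.length];
        for (int i = 0; i < ef.length; i++) {
            efhd[i] = ef[i] * (i + 1); // index 0 is level 1, index 11 is level 12
        }
        return efhd;
    }

    public static void main(String[] args) {
        // Ogre EF values from the example, with 1/3 assumed for the rounded 0.3 entries.
        double third = 1.0 / 3.0;
        double[] ogreEf = {2.0, 2.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5, third, third, third};

        StringJoiner line = new StringJoiner(" ");
        for (double v : toEfhd(ogreEf)) {
            line.add(String.format(Locale.ROOT, "%.1f", v));
        }
        System.out.println(line); // prints 2.0 4.0 3.0 4.0 5.0 3.0 3.5 4.0 4.5 3.3 3.7 4.0
    }
}
```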

Now, given that we want to present a single number to represent “monster power”, the question is, what metric do we use to crunch those numbers down to one value? The EFHDs in the example above are approximately the same, which is good news for the overall idea of making a single summary value in the first place; that is, in the graph above, they all fall roughly along a horizontal line. This is generally true for most monsters, but not all. Monsters with save-or-die abilities that short-circuit PC hit points have EFHDs that trend upward (positive slope), while monsters with area attacks that can kill lots of low-level fighters have EFHDs that trend downward (negative slope).

Previously I was computing an overall monster Equivalent Hit Dice (EHD) by taking the arithmetic mean of the EFHD values. That's an obvious choice, but one that has notable limitations; the arithmetic mean is sensitive to outliers, so if a monster is unusually dangerous versus PCs of one specific level, then the EHD can be wildly biased in that direction. As an extreme case, some monsters can actually face off against an infinite number of 1st-level attackers – e.g., Golems and Elementals are hit only by +2 or better magic weapons, which mere Veterans will not have – and in such a case, the mean of the EFHDs itself becomes infinite. As a result, I was forced to mark those monsters as having undefined EHDs (asterisks in the OED Monster Database).

So: Here’s where the change comes in. What other options do we have for computing an average (measure of center) of the EFHDs? The median could be considered; it’s less sensitive to outliers, but would suffer the same fate if, in theory, half of the PC levels had infinite EFs (it has a 50% breakdown point). Instead, the silver bullet seems to be the harmonic mean, that is, the reciprocal of the average of the reciprocal EFHDs. The neat thing here is that the harmonic mean actually has no problem handling infinite values (with the understanding that 1/∞ = 0). It’s dominated by a multiple of the minimum value (at most the number of entries times the minimum), so the only way the harmonic mean can become infinite is if every single EFHD value is infinite (that is, the monster would be unbeatable by PCs of any level; in which case we can interpret the infinite EHD as communicating the fact that awarding XP is a non-consideration).
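
Here is a minimal sketch of a harmonic mean that tolerates infinite entries. The class and method names are hypothetical and this is not lifted from MonsterMetrics; it relies only on standard Java floating-point behavior, where `1.0 / Double.POSITIVE_INFINITY` evaluates to `0.0`. It assumes every entry is positive.

```java
public class HarmonicMeanSketch {

    /** Harmonic mean: the count divided by the sum of reciprocals. Assumes positive entries. */
    public static double harmonicMean(double[] values) {
        double sumOfReciprocals = 0.0;
        for (double v : values) {
            sumOfReciprocals += 1.0 / v; // an infinite entry contributes 0.0
        }
        // If every entry was infinite the sum is 0.0 and the result is Infinity.
        return values.length / sumOfReciprocals;
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        // One infinite entry: the result stays finite, at count × minimum = 2 × 4.
        System.out.println(harmonicMean(new double[] {inf, 4.0})); // prints 8.0
        // Only when all entries are infinite does the mean itself become infinite.
        System.out.println(harmonicMean(new double[] {inf, inf})); // prints Infinity
    }
}
```

The first call also shows the "dominated by the minimum" property at its extreme: with all other entries infinite, the mean lands exactly on the number of entries times the smallest one.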

If the EFHD values are all equal, which is approximately true for most monsters, then the harmonic mean is exactly the same as the arithmetic mean, so we won’t get wholesale changes in our EHD values. For example: The Ogre above has an EFHD harmonic mean of about 3.5; so I set the EHD value to 4, the same as before (and the same as its actual hit dice, which is nice). On the other hand, the harmonic mean is always less than or equal to the arithmetic mean, so in some cases, for high-level monsters with skewed power curves, the suggested EHD was reduced by a few pips. For example: The Red Dragon presented the most extreme case of this, in which its EHD dropped from 32 to 27 (a 5-point difference). I think that this bias in the downward direction may be reasonable for a few reasons: (a) it emphasizes the most likely use-case of opposition PCs (that is: the most effective PC level), and (b) it suggests that accounting for magic spells and the like may reduce the overall danger level of such monsters.
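
For reference, here are the two means written out, with the Ogre plugged in as a check. The symbols H, A, n, and x_i are notation introduced just for this note (the x_i being the n = 12 EFHD entries), and the Ogre figures use the rounded EFHD values printed earlier, so they are approximate.

```latex
H = \frac{n}{\sum_{i=1}^{n} 1/x_i}, \qquad
A = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad
H \le A, \quad \text{with } H = A = c \text{ when every } x_i = c.

% Ogre, from the rounded EFHD array (n = 12):
A = \frac{44.0}{12} \approx 3.67, \qquad
H \approx \frac{12}{3.448} \approx 3.48
```

So for the Ogre the two averages differ by less than 0.2.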

Most importantly, this change in methodology allowed us to fill in the previously-missing EHD values for monsters like Golems and Elementals (the tricky cases mentioned above). The most dangerous monster by this metric is now seen to be the Iron Golem from D&D Supplement I, with a devastating EHD value of 104! (All other monsters top out at around 40 EHD: cf. the Stone Golem at 41, Large Earth Elemental at 39, Vampire at 38, etc.) This allowed me to cut the number of monsters with "undefined" EHDs by more than half (adding 14 monster types to the list); those that are left have copious spell-like abilities (lich, titan, beholder, etc.), are invulnerable to any fighter weapons (oozes), or have attacks that don’t actually kill people (rust monster).

A bunch of stuff got updated to follow along with this method of using the harmonic mean on EFHD values. See the code on GitHub. In particular, the MonsterMetrics.java module received new command-line switches:

**-e** to see the Equated Fighters array,

**-d** to see the Equated Fighters Hit Dice array, and

**-g** to see the latter graphed as ASCII text.

I uploaded the output for all monster evaluations and graphs (at 5 units per step on the y-axis). The OED Monster Database has been updated, and so has the Monster EHD Listing (including sorting by both name and EHD value). And more to come next week.

Thanks to CUNY/Kingsborough Professor of Mathematics Nataniel Greene for suggestions that clarified my problem and led me in this direction.

So the EFHD is dominated by the minimum value... that could be an issue if it results in some unexpectedly difficult encounters, in the "I thought this would be a fun encounter for this group, but it turned out to be nigh impossible for them" vein.

Also, would it maybe make sense to restrict the range of levels tested against to only around those likely to be encountered in the WM tables, or those likely to actually attempt to fight it? E.g., if it needs a party of 20+ PCs to even stand a chance, then the correct procedure is to run away or attempt to bargain, anything but fight. Admittedly they may want to lead the peasant army, so you'd want to know how many of those it would take, but that seems like an unusual situation.

On the latter point: I agree, that certainly could have been the approach taken here. I sort of considered it as the first option. But in the context of a semi-randomized sandbox I felt it better to look at a range of levels than just one (or a few), as a reflection of what's more likely to happen on the ground.

On the first point, I think that's not much of an issue, because when the term "dominated" is used here, the bound is actually the number of levels times the minimum value. Say, if there's one EFHD entry of 3, then the harmonic mean (EHD) can't be more than 3 × 12 = 36 no matter what the other entries are. That's a sufficiently liberal threshold that I don't think any monster in the system actually hits it. In other words: the EFHD entries are indeed generally "similar enough" that this isn't a problem.

I just added a scan in the program for that first-point issue, and it appears that no monster in the system comes even halfway towards that theoretical "dominated" threshold.

Cool -- sorry to have given you extra work, but probably better off knowing!

I agree! It was only like 5 min of work. :-)

Great work advancing the art here, Delta!

Out of curiosity, how far have you been tempted down the path of writing combat AI for the hard-to-model monsters so far?

At least a few days a week I consider pursuing a PhD to make that happen. Not a joke.

You never fail to amaze and impress!

Any chance you will use this method to compute the EHDs for the denizens of the Monster Manual that don’t appear in the LBBs?

And as for the PhD — Dr. Delta!