Following on from the discussion of systematized EHD (equivalent hit dice) last week, let's look at a much earlier attempt at the same idea. In the first three issues of White Dwarf magazine, Don Turnbull presented a measurement he called "The Monstermark System". This would be through the summer and fall of 1977, that is, exactly 40 years ago as I write this. (Thanks to Stephen Lewis for the tip-off to these articles!)
In the third article in the series, Turnbull writes:
Although it has been said by quite a few D&D addicts that the Greyhawk system of experience points, which is based on monsters' hit dice, is too stingy I don't think this is something which can be considered in isolation... So, circuitously, back to experience points. In my view they are intended to reflect risk. A character gets experience for meleeing with a monster because there is a finite, non-zero, risk that he will be killed or at least suffer wounds which could contribute to his eventual death. He gets experience for gold because he has taken risks to grab it... He should not, however, get experience for finding a magic sword or that seven-spell scroll since these things will assist him in getting experience by other means... Since the whole point of the Monstermark is to measure the risk inherent in tackling a particular monster, experience points should bear a linear relationship to M...
I fully agree with those observations, and my motivation for EHD is exactly the same: to provide a measure of risk, from which we can support a simple, linear calculation for experience points. We both assume a protagonist fighter with a fixed armor type, shield, and a sword; we both give the fighter one attack per round. Now, the basis of his system is this: for the default fighter, compute the amount of damage he would expect to take fighting the monster (assuming the combat never ended from the fighter's death). In this case, the calculation is done by first computing the number of rounds the monster would expect to live (D), and then multiplying that by the monster's expected damage per round (analogous to the DPS -- damage-per-second -- statistics in MMORPGs) for an overall aggression level (A). In the first article, Turnbull presents it like this:
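As a rough sketch of that calculation in code (an illustration only, not Turnbull's own notation): the function names, the simple hit-points-over-damage approximation for D, and all the numbers for the sample monster are assumptions made up for the example.

```python
def expected_rounds_alive(monster_hp, fighter_hit_chance, fighter_avg_damage):
    """D: approximate number of rounds the monster expects to live.

    Assumption: hit points divided by the fighter's average damage per
    round, ignoring overkill on the final blow.
    """
    return monster_hp / (fighter_hit_chance * fighter_avg_damage)


def aggression(monster_hp, fighter_hit_chance, fighter_avg_damage,
               monster_hit_chance, monster_avg_damage, attacks_per_round=1):
    """A: damage the never-dying fighter expects to take over the fight."""
    rounds = expected_rounds_alive(monster_hp, fighter_hit_chance,
                                   fighter_avg_damage)
    damage_per_round = attacks_per_round * monster_hit_chance * monster_avg_damage
    return rounds * damage_per_round


# Hypothetical monster: 9 hit points, one attack averaging 3.5 damage,
# hitting the fighter 25% of the time; the fighter hits it 50% of the
# time for an average of 4.5 damage.
example_d = expected_rounds_alive(9, 0.5, 4.5)
example_a = aggression(9, 0.5, 4.5, 0.25, 3.5)
print(example_d, example_a)
```

Note that everything here is a plain expected value; nothing in it can express a special ability.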
This seems like a solid, undeniably valid base measure of monster risk level.
As long as the monster has no special abilities. Which is, as you know, almost none of them. As soon as a monster has special abilities, then Turnbull is forced to step out of the methodical expected-value analysis and revert back to a purely discretionary set of multipliers, hoping to estimate the power of various abilities, to get the final MonsterMark score (M). As he writes, "All this is very subjective and I would be surprised not to meet with different views, but the following bonus relationships seem to give results which instinctively 'feel' right:"
Now, if you take only one thing away from this blog, I hope that it's this: these kinds of a la carte scoring systems for game entities are always a lost cause. The inter-relationships of different abilities and powers are too complicated to be encapsulated in such a system; the true acid test can only be made by systematic playtesting (which is very hard).
Consider a few short counterexamples -- A giant rat given magic-to-hit defense is effectively unbeatable by the PCs it normally fights; but a very old red dragon, given the same ability, would have little effect against its high-level opponents (surely wielding magic weapons already). If ghouls have possibly paralyzing attacks, then it makes a huge difference if they have one attack for 1d6 damage, versus three attacks for 1d2 damage (even with nearly the same expected damage). Centipedes and carrion crawlers, with a base damage of zero, even with poison or paralysis, would generate a product that is still zero by this multiplicative system. And so on and so forth.
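Two of those counterexamples can be put into numbers with a minimal sketch. The 50% hit chance and the multiplier value are hypothetical placeholders, not figures from Turnbull's articles or from the rulebooks.

```python
def chance_of_any_hit(hit_chance, attacks):
    """Probability that at least one attack lands in a round."""
    return 1 - (1 - hit_chance) ** attacks


def expected_damage(hit_chance, attacks, avg_damage_per_hit):
    """Expected damage per round across all attacks."""
    return attacks * hit_chance * avg_damage_per_hit


HIT_CHANCE = 0.5  # hypothetical chance for each attack to hit

# One attack for 1d6 (average 3.5) versus three attacks for 1d2 (average 1.5).
one_big = (chance_of_any_hit(HIT_CHANCE, 1),
           expected_damage(HIT_CHANCE, 1, 3.5))
three_small = (chance_of_any_hit(HIT_CHANCE, 3),
               expected_damage(HIT_CHANCE, 3, 1.5))


def multiplicative_score(base_aggression, ability_multiplier):
    """A score built as base aggression times an ability bonus multiplier."""
    return base_aggression * ability_multiplier


# A monster with zero base damage scores zero whatever the multiplier is.
zero_damage_score = multiplicative_score(0.0, 2.5)  # 2.5 is a placeholder

print(one_big, three_small, zero_damage_score)
```

Under these assumptions the expected damage per round is similar for the two ghoul variants, but the chance of landing at least one possibly-paralyzing hit in a round is much higher with three attacks, which a single damage figure cannot show.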
Nevertheless, Turnbull pushes forward with the tools he has, first presenting a table of basic humanoids without special abilities (of which there's really only a half-dozen), and then separate tables for various other categories of monsters from OD&D, the Greyhawk supplement, and a few magazine articles current at the time. For a few examples of his M scores: orcs get 2.2, ogres 29.9, trolls 158.4, and red dragons 675.5 (by comparison, I give those creatures EHD values, respectively, of 1, 4, 9, and 32; and no, I don't think that going into decimals here is a great idea). Ultimately he recommends giving XP of 10 times his M score, which is generally about double the low Greyhawk XP awards for these sample creatures (whereas I still prefer 100 times the EHD level, in the spirit of Vol-1).
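To make the two award rules concrete, here is a small sketch applying each to the four sample creatures above. The function names are made up for the example, and rounding the Monstermark award to a whole number is an assumption.

```python
# (Monstermark M, EHD) for the four sample creatures quoted above.
SAMPLES = {
    "orc": (2.2, 1),
    "ogre": (29.9, 4),
    "troll": (158.4, 9),
    "red dragon": (675.5, 32),
}


def xp_from_monstermark(m):
    """Turnbull's recommendation: 10 XP per point of M (rounded here)."""
    return round(10 * m)


def xp_from_ehd(ehd):
    """The alternative preferred here: 100 XP per EHD level."""
    return 100 * ehd


xp_table = {name: (xp_from_monstermark(m), xp_from_ehd(ehd))
            for name, (m, ehd) in SAMPLES.items()}

for name, (xp_m, xp_ehd) in xp_table.items():
    print(f"{name:10s}  10 x M = {xp_m:5d}   100 x EHD = {xp_ehd:5d}")
```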
There are 73 monsters for which Turnbull & I both are willing to give measurements. Consider the correlation between our assessments:
That's not very close at all. The data points are scattered all over the place, not close to any regular relationship; knowing one measure only allows you to predict about 50% of the variation in the other measure. On average, Turnbull's Monstermarks are about 20 times what I find for EHD levels, but that doesn't tell us much. He assumes plate armor for fighters whereas I assume chain (for reasons given last week), but that can't explain the low correlation either. Let's look at some specific cases for why this is.
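For anyone wanting to run this kind of check themselves, here is a minimal sketch of the correlation calculation. It uses only the handful of (EHD, Monstermark) pairs quoted in this post, not the full set of 73 monsters, so it should not be expected to reproduce the figures above; the function name is made up for the example.

```python
import math

# (EHD, Monstermark) pairs quoted in this post; a small subset only.
PAIRS = {
    "orc": (1, 2.2), "ogre": (4, 29.9), "troll": (9, 158.4),
    "red dragon": (32, 675.5), "basilisk": (25, 128), "medusa": (13, 56),
    "carrion crawler": (14, 120), "harpy": (9, 22), "vampire": (39, 440),
    "treant": (33, 420), "fire lizard": (14, 758), "hydra (10 heads)": (18, 707),
    "mind flayer": (20, 700),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


ehd_values = [ehd for ehd, _ in PAIRS.values()]
mm_values = [mm for _, mm in PAIRS.values()]

r = pearson_r(ehd_values, mm_values)
r_squared = r * r  # share of variation in one measure predicted by the other
print(f"r = {r:.2f}, r^2 = {r_squared:.2f}")
```

The squared correlation is the quantity behind a statement like "predicts about 50% of the variation".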
The most obvious problem for Turnbull is this:
The Monstermark system cannot handle area effect abilities at all. His model tries to do accounting on the hit points from breath weapons (in the 2nd article), but he steadfastly assumes just a single deathless fighter in melee against a given monster; so, if a red dragon breathes fire, then only damage to that one fighter is accounted. But that doesn't reflect the true risk or utility of area-effect weapons like that; our PCs don't adventure in solitude but in groups of some size. The examples of dragon combat in both OD&D and AD&D show three PCs being incinerated at once from a single breath attack; so the damage/risk multiplier should really be at least several times higher than Turnbull counts. Likewise, petrification weapons get no distinction for delivery by touch or wide-area gaze -- the cockatrice (touch), medusa (gaze), and basilisk (both!) each get an identical 2.5 multiplier for their abilities. This alone probably accounts for a massive skewing in many of his scores, downward from the true risk level. In contrast, my Monster Metrics program runs up to 64 opposition fighters simultaneously against any given monster, and they suffer appropriately from area or gaze weapons.
Some examples where the Monstermarks seem clearly too low:
- Basilisk (EHD 25, MM 128), with its combined touch-and-gaze petrification, which only gets the same multiplier as a cockatrice does.
- Medusa (EHD 13, MM 56), likewise with her area-effect gaze petrification.
- Carrion Crawler (EHD 14, MM 120); as noted above, the multiplication system from zero damage should come out to zero, so I think he just made this up from whole cloth (note the round number).
- Harpy (EHD 9, MM 22), with her mass charm song ability, shouldn't be weaker than an ogre.
Another rather egregious issue is this, although it affects only two creatures:
Summoning abilities are entirely left out of the accounting. As noted before, we find these abilities to be among the most potent in the game! But the Monstermark system actually overlooks them entirely, giving no bonus at all for them.
- Vampire (EHD 39, MM 440), given no summoning abilities.
- Treant (EHD 33, MM 420), which actually appears in Turnbull's first table of "simple human-type monsters" without any special abilities, and yet its tree-controlling ability allows it to effectively triple its own brute strength. (As an aside, consider a vampires-vs-treants scenario, in which we find two of the most powerful opposition monsters in the game due to their parallel summoning abilities.)
Meanwhile, there are some other monsters with nothing but brute strength that appear too highly scored -- like the Fire Lizard (EHD 14, MM 758), and Hydra with 10 heads (EHD 18, MM 707) -- but I think that this is only an artifact of the special ability monsters being relatively too low. Also, the Mind Flayer's score seems ridiculous (EHD 20, MM 700), granted that he doesn't even note its mind blast power, and was probably again just a raw guess (another suspiciously round number).
Now, there are two other cases that literally jumped off the chart above, such that I felt compelled to remove them as outliers -- and on inspection they are rather obviously in error. These were:
- Roper (EHD 16, MM 3,750). This is clearly a mistake. Turnbull notes the creature in part 2, p. 15: "These calculations make the Ropers the most fearsome beasts we have met so far; I don't recall ever meeting them down a dungeon, and I devoutly hope I never will." The problem, if I'm reading his attack notation correctly, is that he's applied the Roper's 5d4 damage factor -- which should be just for its mouth -- to every single one of its 6 ranged tentacle attacks. That really would be horrifying! While the Roper is a tough customer, it obviously shouldn't be worth the same as 5 or 6 Red Dragons; that doesn't pass any kind of sanity check.
- Flesh Golem (EHD 21, MM 1,920). In this case, the problem is that Turnbull shows a radically different AC for the monster than I see in the books: My copy of Sup-I (with correction sheet) gives it AC 9, as does the AD&D Monster Manual. Turnbull shows it as having an AC of -1, which is obviously the diametrical opposite. I'm not sure where he got that from; maybe from a wild guess before the Sup-I correction sheet was available to fill in that statistic?
There were some other things I had to leave out of the analysis, such as those other golems and elementals that are hit by only +2 or better magic weapons, which have undefined EHD in my model. Turnbull gives medium and large elementals a score of 1,000-2,000, stone golems nearly 13,000, and iron golems just shy of 33,000 (but again their ACs are treated as much harder than in the rulebooks, namely AC -3 and -5, so there are multiple reasons to leave them out of our comparison).
In conclusion, while the motivations are exactly the same, the scores that Turnbull & I come up with are radically different, effectively incommensurable. (If you want the full data, my Monster Database from last week has Turnbull's MonsterMarks entered in hidden column Q.) Of course, while Turnbull's instinct was noble, he didn't have the immense computing power all around us to simulate playtests the way we can today. Now, maybe someone will come back to critique my work in another 40 years -- someone who has access to a complete game engine with all the special abilities, full wizard spell selection, mixed-class PC party simulator, and hard Artificial Intelligence to optimize the best tactical choices on each side -- and in that light my suggestions might look totally naive. We can only hope for such continuity and progress.