Monday, October 31, 2011

Book of War: Pricing and Simulation

I'm guessing that price-balancing the different unit types took about 80% of the overall work effort in the Book of War project. Something like that. (Note: Halloween content at the bottom of this post.)

Previously I pointed out how I felt compelled to write a simulator program for mass D&D combat at the man-to-man scale, so that we could carefully investigate what the overall trends of that kind of action would be. (Here.) But even more importantly than that, and in fact predating that by a few years, I also wrote a simulator for the full Book of War game itself. While I was working on the game, I was usually ping-ponging between book text and simulator program, to make sure that things were working out in a reasonable way. This allowed me to run lots of head-to-head battles at proposed price levels, to try and get things cost-balanced as much as possible -- like, my estimate is that approximately 2 billion simulated games of Book of War have been run to date for that purpose. Here's the simulator program, for your consideration (in Java; under GPL v.2 license):


Now, pretty much everything I could practicably include from the game is in that simulator program. It handles all the different unit types, figure sizes, terrain, weather, morale, most special attack forms, monsters, heroes, wizards, etc. There are switches for different core assumptions and optional rules, and the ability to either look at one detailed game at a time, or assess the overall results of several thousand battles (with some statistics thrown in to make it more efficient). Some limitations: It only throws one homogenous unit type face-on at another unit type at a time; it presumes a board covered with just one terrain type; it doesn't handle wrapping around unit edges; and stuff like that. As usual, there's no GUI, so you'll need a Java development kit to tinker with it. But I think as a first pass for checking unit prices it's a pretty powerful tool.

My primary strategy when price-balancing was something like this: Initial values were set as per the Internet Medieval Sourcebook (currently offline), which are also quite close to the classic D&D men-at-arms costs (Vol-3, p. 23). Simulations used a random budget for each side in the range of 50-200 points (you need a variable range, because a fixed price-point will show biased artifacts from one unit type having a disadvantageous remainder). For a particular group of units, create a cross-matrix of every unit battling every other unit type some thousands of times (in each case with starting range, terrain, and weather randomized). At the end, report the percentage of times that each unit beats every other; and add up the number of favorable matchups (i.e., cases where a unit wins over 50% of the time). I generally wanted these "win" matchups to be about half the number of opposing unit types -- but arranging exactly that was impossible, and even getting them vaguely close was pretty hard. There was a lot of tweaking a unit type by 1 or 2 points, which would change all the other matchups, finding one or two that were then unacceptable, and needing to re-tune again and again. If I ever changed an actual game rule, then the whole balancing process had to be redone from scratch. This was certainly the hardest part of the game-design process, but at the end of it I was pretty happy with the results.
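The cross-matrix procedure can be sketched in a few lines of Python. This is only a toy under stated assumptions, not the Java simulator: the unit costs and "power" numbers are invented, `toy_battle` is a hypothetical stand-in for a full simulated game, and both sides are assumed to share the same randomly rolled budget in a given game.

```python
import random

# Hypothetical costs and strengths, for illustration only (not the real BOW stats).
UNITS = {
    "P": {"cost": 5, "power": 1.0},
    "S": {"cost": 7, "power": 1.4},
    "C": {"cost": 20, "power": 4.2},
}


def toy_battle(a, b, budget, rng):
    """Stand-in for one simulated game; True if unit type `a` beats `b`."""
    def strength(name):
        unit = UNITS[name]
        figures = budget // unit["cost"]  # leftover points are simply wasted
        return figures * unit["power"] * rng.uniform(0.5, 1.5)
    return strength(a) > strength(b)


def win_matrix(names, trials=2000, budget_range=(50, 200), seed=1):
    """Percent of games that the row unit wins against the column unit."""
    rng = random.Random(seed)
    matrix = {}
    for a in names:
        for b in names:
            if a == b:
                continue
            wins = 0
            for _ in range(trials):
                budget = rng.randint(*budget_range)  # fresh budget every game
                wins += toy_battle(a, b, budget, rng)
            matrix[(a, b)] = 100 * wins / trials
    return matrix


def favorable_matchups(matrix, names):
    """Count the opponents each unit beats more than 50% of the time."""
    return {a: sum(1 for b in names if b != a and matrix[(a, b)] > 50)
            for a in names}


if __name__ == "__main__":
    names = list(UNITS)
    matrix = win_matrix(names)
    print(matrix)
    print(favorable_matchups(matrix, names))
```

The integer division in `strength` is where the "disadvantageous remainder" shows up: at a fixed budget of 50, a 20-point unit wastes 10 points while a 5-point unit wastes none, which is why the budget is rolled fresh for every game.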

For example, here is the output from the matchups among just our "core melee types": pikes (P), medium swords (S), and heavy cavalry (C). Percentages of 50 or more are shown when the unit on the left tends to win against the unit on top:


Assessed win percents (budget 50-200):

.. P. S. C. Wins
----------------
P. -- -- 62 1
S. 52 -- -- 1
C. -- 55 -- 1



So what this shows is that, as desired, there is no single "best" unit type in this trio. Probabilistically speaking, on a pro-rated-cost basis, cavalry tends to lose to pikes (62% of the time); pikes usually lose to swords (52%); and swords generally lose to cavalry (55%). Compare to the desired "Close Combat Trinity" graphic from the post last Jan-3 (which really laid out our game-design goals for the year, as it were; shown again to the right).
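As a quick sanity check, the "no single best type" property of that table can be tested mechanically. The sketch below copies the printed percentages (with `None` for each "--" cell) and checks that the "beats" relation forms one closed loop; the function names are made up for this illustration.

```python
# Win percentages from the core-melee table above; None marks a "--" cell.
WIN_PCT = {
    "P": {"S": None, "C": 62},
    "S": {"P": 52, "C": None},
    "C": {"P": None, "S": 55},
}


def beats(table):
    """Map each unit to the units it beats at least 50% of the time."""
    return {unit: [foe for foe, pct in row.items() if pct is not None and pct >= 50]
            for unit, row in table.items()}


def is_cycle(table):
    """True if following 'beats' from one unit visits every unit and returns."""
    b = beats(table)
    if any(len(foes) != 1 for foes in b.values()):
        return False
    start = next(iter(b))
    seen, current = [], start
    while current not in seen:
        seen.append(current)
        current = b[current][0]
    return current == start and len(seen) == len(b)


if __name__ == "__main__":
    print(beats(WIN_PCT))
    print(is_cycle(WIN_PCT))
```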

An important part of these results is that they assume the terrain and weather percentages as shown in the Book of War rules. Terrain frequency has been carefully set so as to have this balanced trinity appear in the results (for example: rolls for terrain are fixed at one-quarter of the table space in square feet, which is actually based on both core unit balancing and aesthetic considerations for how the board would best look). Cavalry does best in open terrain, and so do pikes; but swords gain an advantage fighting the others in any kind of unusual terrain. (Or in other words, dial up the rough terrain frequency and swords appear to do better against the other types more of the time.) There's also an estimated probability for opponents flanking pikes and avoiding their defensive benefit (dial this assumption up and pikes do less well against any other type). Over all possible combinations of these factors, we get the win probabilities shown above.

Next, here are more assessment results when we add some missile troops to the mix: longbows (L) and horse archers (HA):


Assessed win percents (budget 50-200):

.. P. S. C. L. HA Wins
----------------------
P. -- -- 63 -- 50 2
S. 53 -- -- 50 -- 2
C. -- 55 -- 70 66 3
L. 59 50 -- -- 51 3
HA 51 56 -- -- -- 2



If we analyze this under the game-theoretical principle of strategic dominance, then we are again happy, because no unit type is "dominated" by any other -- that is, there is no clear "suboptimal" choice (in fact, every type is the "best" response to some particular selection by the opponent, except for swords, which are themselves the only type that can generally beat both pikes and longbows).
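Here is a rough sketch of that dominance check in Python, using the five-unit table above. Two assumptions are baked in and are not from the simulator: each "--" cell is filled in as 100 minus the mirrored entry (ignoring draws and rounding), and a mirror match counts as 50. A unit is called dominated if some other unit does at least as well against every opponent and strictly better against at least one.

```python
UNITS = ["P", "S", "C", "L", "HA"]

# Printed cells from the table above: (row, column) -> row unit's win percent.
PRINTED = {
    ("P", "C"): 63, ("P", "HA"): 50,
    ("S", "P"): 53, ("S", "L"): 50,
    ("C", "S"): 55, ("C", "L"): 70, ("C", "HA"): 66,
    ("L", "P"): 59, ("L", "S"): 50, ("L", "HA"): 51,
    ("HA", "P"): 51, ("HA", "S"): 56,
}


def full_matrix(units, printed):
    """Assumption: a "--" cell is 100 minus its mirror; mirror matches are 50."""
    m = {}
    for a in units:
        for b in units:
            if a == b:
                m[(a, b)] = 50
            elif (a, b) in printed:
                m[(a, b)] = printed[(a, b)]
            else:
                m[(a, b)] = 100 - printed[(b, a)]
    return m


def dominated(units, m):
    """Units that some other unit matches or beats against every opponent."""
    out = []
    for a in units:
        for b in units:
            if a == b:
                continue
            no_worse = all(m[(b, k)] >= m[(a, k)] for k in units)
            better = any(m[(b, k)] > m[(a, k)] for k in units)
            if no_worse and better:
                out.append(a)
                break
    return out


def best_responses(units, m):
    """For each opposing unit, the unit with the highest win percent against it."""
    return {k: max(units, key=lambda a: m[(a, k)]) for k in units}


M = full_matrix(UNITS, PRINTED)

if __name__ == "__main__":
    print(dominated(UNITS, M))
    print(best_responses(UNITS, M))
```

Under those assumptions the dominated list comes back empty, and swords are the one type that never shows up as a best response, which agrees with the reading of the table given above.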

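And here we look at all 12 historical types in the Book of War "Basic Game" rules (4 types of infantry; 4 types of archers; and 4 types of cavalry). Note that the abbreviations shift in this table: C. now stands for crossbows, and HC appears twice, first for heavy crossbows (in the archer group) and then for heavy cavalry (in the cavalry group):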


Assessed win percents (budget 50-200):

.. LI MI HI P. A. L. C. HC LC MC HC HA Wins
-------------------------------------------
LI -- -- -- -- -- -- -- -- -- -- -- -- 0
MI 67 -- 50 53 -- 50 -- 50 55 51 -- -- 7
HI 64 51 -- 56 -- 54 -- 51 51 -- -- -- 6
P. 57 -- -- -- -- -- -- -- 70 68 63 50 5
A. 62 57 53 62 -- -- 80 83 -- -- -- 52 7
L. 55 50 -- 58 63 -- 72 79 -- -- -- 51 7
C. 68 63 59 65 -- -- -- -- 54 52 -- 63 7
HC 57 -- -- 55 -- -- 66 -- -- -- -- 60 4
LC 52 -- 50 -- 51 62 -- 61 -- -- -- 61 6
MC 55 -- 51 -- 54 63 -- 57 51 -- -- 61 7
HC 61 55 59 -- 63 70 56 65 67 61 -- 67 10
HA 61 56 56 51 -- 50 -- -- -- -- -- -- 5



Now, in this case, some types are "dominated" (apparently always-bad choices), and it was impossible to avoid that. Here's one example: In the first row, light infantry generally lose to any other unit type; this was unavoidable, because if we reduce the cost of light infantry in the game by even 1 point, then suddenly they'll actually beat every other unit in the game (an issue with the granularity of a unit that cheap). But even with the price at which we set them, they're still potentially useful in the game as a scouting, delaying, rough-terrain force (in fact, my girlfriend has used them to excellent results several times).

Nonetheless, let's go through a process of taking out units that clearly have some other unit with better results in all cases (that is, IEDS -- iterated elimination of dominated strategies):


Assessed win percents (budget 50-200):

.. MI HI P. A. L. C. HC Wins
----------------------------
MI -- 50 52 -- 50 -- -- 3
HI 50 -- 56 -- 53 -- -- 3
P. -- -- -- -- -- -- 63 1
A. 56 53 62 -- -- 80 -- 4
L. 50 -- 58 63 -- 73 -- 4
C. 62 60 65 -- -- -- -- 3
HC 55 59 -- 62 70 57 -- 5



So what's been eliminated here are: Light infantry, light cavalry, medium cavalry, horse archers, and heavy crossbows -- these types are not optimal in a frontal clash, while the remaining types clearly have some kind of sticking value in that situation. But do notice that even the types we apparently eliminated under this analysis generally have high mobility, and might be useful in strategic ways outside the scope of our simulator (like splitting multiple enemy units apart, or hit-and-run tactics, or gaining the flank or rear of an enemy, etc.). So you should consider using these results as one tool in your game strategy guide, but certainly not the only one.
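For anyone who wants to try the elimination process on their own numbers, here is a minimal sketch of iterated elimination in Python. The five units and their win percentages are entirely made up (W, V, X, Y, Z are not Book of War types); the point is only to show how removing one dominated unit can expose another.

```python
# Hypothetical win percentages: WIN[a][k] is how often unit a beats unit k.
WIN = {
    "W": {"W": 50, "V": 10, "X": 40, "Y": 30, "Z": 20},
    "V": {"W": 90, "V": 50, "X": 40, "Y": 50, "Z": 40},
    "X": {"W": 60, "V": 60, "X": 50, "Y": 55, "Z": 45},
    "Y": {"W": 70, "V": 50, "X": 45, "Y": 50, "Z": 55},
    "Z": {"W": 80, "V": 60, "X": 55, "Y": 45, "Z": 50},
}


def is_dominated(a, units, win):
    """True if some other remaining unit is never worse and sometimes better."""
    for b in units:
        if b == a:
            continue
        no_worse = all(win[b][k] >= win[a][k] for k in units)
        better = any(win[b][k] > win[a][k] for k in units)
        if no_worse and better:
            return True
    return False


def iterated_elimination(units, win):
    """Repeatedly drop dominated units; return (survivors, rounds of removals)."""
    remaining = list(units)
    rounds = []
    while True:
        losers = [a for a in remaining if is_dominated(a, remaining, win)]
        if not losers:
            return remaining, rounds
        rounds.append(losers)
        remaining = [a for a in remaining if a not in losers]


if __name__ == "__main__":
    print(iterated_elimination(list(WIN), WIN))
```

In this toy data V survives the first round only because it crushes W; once W is gone, X is at least as good as V everywhere, so V drops out in the second round and the X/Y/Z cycle is all that remains.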

Another thing to take into account: The indicated win percentages skew differently for higher or lower budget sizes. For example, I picked the base budget size of 50-200 as a guess for the nominal largest single unit size that you'd see for the basic game. (Although when pricing more expensive monsters, heroes, and wizards, I had to increase the budget amounts or else in some cases you couldn't afford even a single figure!) Here's a look at some other budget levels and the effects thereof, just for the "core +2" unit types:


Assessed win percents (budget 20-100):

.. P. S. C. L. HA Wins
----------------------
P. -- 58 66 -- -- 2
S. -- -- 50 -- 56 2
C. -- 50 -- 68 66 3
L. 63 57 -- -- -- 2
HA 52 -- -- 50 -- 2


Assessed win percents (budget 50-200):

.. P. S. C. L. HA Wins
----------------------
P. -- -- 65 -- -- 1
S. 52 -- -- -- 61 2
C. -- 55 -- 74 78 3
L. 62 54 -- -- 50 3
HA 51 -- -- 50 -- 2


Assessed win percents (budget 100-500):

.. P. S. C. L. HA Wins
----------------------
P. -- -- 64 -- -- 1
S. 58 -- -- -- 57 2
C. -- 72 -- 67 74 3
L. 67 57 -- -- 50 3
HA 52 -- -- 51 -- 2



So some of the balancing might be off if the assumptions regarding total unit and army size are wrong, but that was unavoidable lest we mandate some fixed unit point size. In any event, the overall tactical situation, unique placement of terrain, and combination of forces within an army will present nearly endless opportunities for finding new ways of units interacting with and against each other. That said, the price-balancing simulator provides a pretty solid initial foundation to the game.

As a final thought, this simulator program is intended to stand as the "acid test" for pricing new units in the Book of War game. (In general, converting from D&D stats to BOW is absolutely trivial -- but getting the price correctly balanced is entirely a different story.) I would never want to present a table of abilities and piecemeal values to be added up; the ways in which different stats and abilities synthesize pretty much totally preclude that. (See Monte Cook's famous lesson in that regard as to the 3E new-magic-item table.) What should be done, at least as a first pass, is to run new units through this simulator and see how they match up against existing types; manual tweaking for novel abilities and playtest feedback can then follow. (Now that I think of it, with some work the simulator probably could be adjusted to take a new unit and spit out a recommended price value, granted the existing basic types as known baseline; but it couldn't be done for the initial group, which would be inherently self-referential; and that particular work will have to wait for another day.)
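To make that last idea a bit more concrete, here is one way such a price search might look, as a sketch only. It assumes a hypothetical function `avg_win_pct(price)` that returns the new unit's average win percentage against the baseline types and never rises as the price goes up; `toy_avg_win_pct` below is an invented stand-in for that, not anything from the simulator. Real simulation output is noisy, so a real version would need many trials per price or some smoothing.

```python
def recommend_price(avg_win_pct, lo=1, hi=100, target=50.0):
    """Smallest integer price in [lo, hi] where avg_win_pct(price) <= target.

    Assumes avg_win_pct is non-increasing in price (a pricier unit buys fewer
    figures, so it should not win more often).
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if avg_win_pct(mid) <= target:
            hi = mid
        else:
            lo = mid + 1
    return lo


def toy_avg_win_pct(price, true_worth=12):
    """Invented stand-in: exactly 50% when price equals the unit's 'true worth'."""
    return 100 * true_worth / (true_worth + price)


if __name__ == "__main__":
    print(recommend_price(toy_avg_win_pct))
```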

Here are text-file versions of the assessment results shown above:



Scary Halloween Special!: Check out the OEDGames.com website for some free "routed" markers for your Book of War game. Put one of these down next to an enemy unit, and watch how fast they run off the board in abject terror! (Offer not applicable in states with trolls or the mindless undead.)

Friday, October 28, 2011

Friday Night Book of War

Here's the household game from last weekend, which happened to involve a lot of pikes and missile units:


Start -- Basic Rules; 200 points. What I've done (at bottom, in red) is take 25 figures of pikes and 15 light crossbows. These are among the cheapest units in the game (cost 5 each), so there's a good many figures on the board for this point-level. I used the same army previously against my good friend BostonQuad, serving him one of the most crushing defeats seen to date (he came at me using all light cavalry, with bad results). Opposition has actually guessed what I might use tonight (based on my blog topics for the week), and has therefore entirely forgone any cavalry -- she's picked a big unit of heavy crossbows (top left) and a whole bunch of light infantry, arranged in small skirmish-type units (top right). Terrain is 1 Hill and 2 units of Woods (plus one "Open" result, which had no effect on the board). I have set up and will move first.



Turn 1 -- At bottom, I've pushed all of my units as far forward as possible. In response, opposition has brought the heavy crossbows a full move forward (they're a bit slower, so couldn't entirely get up the hill), and something unexpected -- the light infantry have all done a right-face and moved in that direction, towards the heavy crossbows on the hill. No attacks were possible on this turn.



Turn 2 -- Disaster for blue! On my turn I've again pushed the pikes forward as far as possible -- in particular charging over the hill to attack the heavy crossbows. Note that in this location, pikes have no capacity for any special defensive strike (although they get the usual 1-die per figure attack on my turn). Crossbows on the right have made partial moves forward, shot down a few light infantry figures at medium range, and those units have both routed. Much more horribly for my opponent, the charging pikes eliminated 2 heavy crossbows, and they routed as well (their morale dice of 3 are shown in the picture; any roll of 4+ would have succeeded).



Turn 3A -- Blue is now in an atrocious position. Two units have run off the table, her other units are all clogged up in the top-left corner of the board (blocking each other), and I've struck the rear of the routing heavy crossbows, eliminating about half of them. Crossbows are picking off more infantry, and other pike units are boxing off any escape.



Turn 3B -- Blue's heavy crossbows have now routed off the table. Purely in desperation, she has sent several units of light infantry at my pikes, just off the northern edge of the hill. The infantry on the left flank managed to kill 2 of my figures; but the infantry from the right-front ran into my double-defensive attack, and they all perished.



Turn 4 -- Those hard-charging pikes have now attacked towards the top left, and killed all of the light infantry unit there. Crossbows have killed all of the unit that tried to attack my rear at the base of the hill, and other units are closing in, too. Opposition has only one unit of light infantry remaining, near the left edge of the board. With blue in a clearly hopeless position, we agreed to end the game at this point (presumably that last infantry escapes off the board-edge). Victory for me! Although, in compensation I have to pay for Chinese tonight, do all the dishes, and clean the cat's teeth.



Postscript -- What you've just witnessed is the single most-lopsided Book of War game that any of us have ever seen. Ultimately it was that abysmal morale-check for the big heavy crossbow unit on Turn 2 that snapped the opposition's backbone, and her best chance against my pikes. I think the main lesson we can take from this game is that (a) the combined pike/crossbow force is clearly the best in the game, and (b) I am a frankly brilliant strategist and wargamer. (Although there's some disputing opinion that I've gotten immensely lucky 2 weeks in a row; keep in mind that over the summer my girlfriend won 10 games in a row without me being able to beat her once. But I'll call that a minor detail.)

There are, however, one or two other lessons here about Book of War play. In most cases, aggression is rewarded; getting the first hit and at least forcing the opponent to roll a morale check is very desirable. And regarding pikes, although their listed special ability is purely defensive, this actually makes them a fantastic offensive weapon. Their move rate is high (12"), so you can rapidly close towards the enemy (like here: getting through the woods before they could be caught there), and the opposition rarely wants to deal with them frontally. So although the defensive benefit is almost never actually triggered (think Chainmail), the enemy tries to get away from them and is thus thrown into disarray. Frequently you get a free single attack by the pikes, and then the enemy opts to run instead of fighting back (and this is exactly the sort of action described by Plutarch at the Battle of Pydna, etc.).

Note that I was even willing to send the pikes attacking up the hill, where they would lose their defensive benefit, which tends to surprise opposing players. In this particular case that shouldn't have worked; I expected the heavy crossbows to succeed morale and then melee, with greater numbers and heavier armor, winning the hill from the pikes. But while that played out they wouldn't have been shooting, and I could move against the rest of her army. As it turned out, that one aggressive stroke was fundamentally all that was necessary to win this particular game.

One other thing: For simplicity, Book of War handles only homogenous units (no mixing different types in the same unit), but you can still arrange a line of missile troops behind infantry and get them to work together, as an emergent property of the archery rules (see how I arranged pikes & crossbows on my starting left in the first picture). Archers can fire over another unit at -1 to hit (and this penalty would go away if they get on a hill at a higher altitude, which you may notice I was running for; see BOW p. 7). Missile attacks back would be at -1, and also have half the attack dice applied to the infantry close in front (acting by default as a kind of "shield man"). But this happened to not come into play in this game.

Maybe I shouldn't be publicly releasing my strategy tips like that, but feel free to thank me later if they work to your advantage. :-)

Wednesday, October 26, 2011

Book of War: Archery

Here's another callback to the "Basic D&D" discussions last spring, this time to the now-infamous "On Archery" blog, which generated a lot of really great comments (and is currently #4 on the list of all-time top read posts for this site). In some sense this issue was a little bit easier to deal with (somewhat fewer moving parts/coupling issues) than either cavalry or pikes. Questions posed at the end of that presentation (commentator consensus in bold):


  1. Would you consider using modifiers of -10/-20 or the like for man-to-man archery? Yes.
  2. Can we use the same modifiers indoors as outdoors (assuming that melee movement counteracts reduced range)? Yes.
  3. Should handheld missiles be without penalty? No.
  4. Should we totally forgo ranged modifiers in mass combat rules? Yes.
  5. Should creatures like giants get separate melee and ranged attack scores? No.


In general, I agreed with these opinions, and I was happy to build them into the way that Book of War works. For example: I suggested Question #5 as an option (giants should have crappier to-hits for missiles than melee), but that was decisively rejected, and it meant one less detail for me to include in BOW.

One other thing: At mass scale, Questions #1 (increase range penalties) and #4 (decrease range penalties) sort of cancel each other out. I actually tested Book of War with no ranged penalties at all, which is something that previously I argued for from a realism perspective, but it didn't make great gameplay in BOW. In particular, it made archers much too strong, or in other words, they had to be crazy-expensive to be balanced with other types (like: priced similar to cavalry), which itself is not terribly realistic. Therefore I decided to go with a middle ground like AD&D penalties of 0/-2/-5, which (divide by 3 and round to closest in this case) turns into 0/-1/-2 in the BOW d6 core mechanic. This made for some very nice play.
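The divide-by-3-and-round conversion is simple enough to write down directly. A minimal sketch (the function name is made up for this example):

```python
def d20_to_d6(modifier):
    """Scale a d20 modifier to the d6 mechanic: divide by 3, round to nearest."""
    return round(modifier / 3)


if __name__ == "__main__":
    adnd_penalties = (0, -2, -5)   # AD&D short / medium / long range
    print([d20_to_d6(m) for m in adnd_penalties])
```

For the three AD&D range penalties this gives 0, -1, and -2, since -2/3 rounds to -1 and -5/3 rounds to -2.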


Other Archery Issues

To the right you'll see the Book of War missile-weapons table, including rate-of-fire (ROF), and maximum standard range in inches. Rate-of-fire is the number of dice one archer figure rolls in an attack -- when motionless; or, you can make a half-dice attack with up to a half-move. These are both similar to what you'll see for classic D&D (in particular for ROF, see text of Chainmail p. 11).

But here's an interesting observation about rate-of-fire: This does not imply or require that man-to-man rates of fire be the same. In fact, in my actual D&D games, I still run archery wherein bows get 1 shot per round, and crossbows 1/2. So why do they get effectively doubled at mass scale? Well, that's a function of the presumed internal formation of each 1:10 figure. Since the 10 men are usually arrayed in 2 ranks, a missile unit with everyone firing effectively gets twice as many attacks over the same time as a melee unit (with only a front of 5 men fighting at once). Or in other words: We've established that a single die-roll in BOW represents 15 attacks across 3 rounds of D&D (link). For a full figure of bowmen shooting over 3 rounds, that's 3×10=30 attacks, or 30/15 = 2 die-rolls in BOW. (Half that for crossbows.)
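That arithmetic, written out as a small Python sketch. The constants come straight from the paragraph above (10 men per figure, 3 rounds, 15 attacks per die); the function name is just for illustration.

```python
from fractions import Fraction

MEN_PER_FIGURE = 10    # 1:10 figure scale, everyone in both ranks shooting
ROUNDS_PER_DIE = 3     # D&D rounds covered by one BOW die-roll
ATTACKS_PER_DIE = 15   # 5-man melee front times 3 rounds


def bow_dice(shots_per_round):
    """BOW attack dice for one missile figure, given the man-to-man rate of fire."""
    attacks = Fraction(shots_per_round) * MEN_PER_FIGURE * ROUNDS_PER_DIE
    return attacks / ATTACKS_PER_DIE


if __name__ == "__main__":
    print(bow_dice(1))                # bows: 1 shot per round
    print(bow_dice(Fraction(1, 2)))   # crossbows: 1 shot per 2 rounds
```

Bows come out at 2 dice and crossbows at 1, matching the doubling described above.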

Conclusion: We might say that AD&D giving multiplied shots per turn for missiles (duplicating Chainmail) was actually in error, and the Basic D&D (Holmes/Moldvay/Mentzer) rule of just 1 shot per round for bows was, in this case, both more elegant and a more clear-headed model of what should be happening. (And as usual: Scaling issues are key.)

Another thing that I was compelled to consider: Mass archery at a distance is incapable of picking out individual opponents, and should have targets picked randomly (as per DMG p. 63, and plain common sense). So: Perhaps over the course of a round, so many arrows hit duplicate (already-killed) targets, that the overall effect is reduced to some degree? This prospect was investigated by simulation in the RPGBattle program. Fortunately, the "redundant shot" effect turns out to be significant in only one case -- if there are a great many archers, and very few potential targets, i.e., the victims are about to be all wiped out anyway. So it's an effect that is thankfully negligible for our purposes. See here for the spreadsheet analysis:
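The mechanism can be illustrated with a little closed-form calculation, sketched here in Python. This is not the RPGBattle simulation or the spreadsheet; it assumes every archer picks a target uniformly at random, all shots land at once, any hit kills, and the hit chance is a flat 20% (picked for illustration).

```python
def expected_kills(archers, hit_chance, targets):
    """Expected number of distinct targets struck at least once."""
    survive = (1 - hit_chance / targets) ** archers
    return targets * (1 - survive)


def efficiency(archers, hit_chance, targets):
    """Share of hits that are not wasted on an already-struck target."""
    return expected_kills(archers, hit_chance, targets) / (archers * hit_chance)


if __name__ == "__main__":
    for archers, targets in [(20, 20), (20, 100), (100, 5)]:
        print(archers, targets, round(efficiency(archers, 0.2, targets), 3))
```

With even numbers on both sides only a small share of hits is redundant; the efficiency collapses only when archers vastly outnumber the targets left standing.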



Something else that was quite popular in the earlier post comments (re: Question #5) was the idea of treating giants throwing stones as a kind of shattering/shrapnel area-of-effect weapon. Now, this was indeed what happened in Chainmail -- but it's ambiguous in OD&D, one of two options in Holmes, and then never again treated that way in any other edition of D&D (discussion here). So in light of this, and for the sake of balance and simplicity, I share the latter interpretation of Holmes -- giant stones are treated as a single-target weapon for 2d6 damage, and are treated by the system the same as any other missile-fire. (Rate-of-fire is 1/2 at man-to-man scale, and thus ROF 1 in BOW.)

Finally, one detail to keep in mind about the decision to use range modifiers that look like 0/-1/-2 in Book of War (analogous to AD&D mods): Note that the scale-switch from classic AD&D 1"=10 yards outdoors to BOW 1"=20' means that using the same range-in-inches (like 18" for a light crossbow) actually shrinks the "real" distance for missile attacks to 2/3 of what it was. As a result, the basic rule in BOW is that we really deal with only two possibilities: under half range (no to-hit modifier) and over half range (at -1). For simplicity, in both gameplay effect and conversions using identical numbers, this was considered very desirable. The Optional Rules section contains the possible variant of extending out to long range (add another half-range, like a further 9" for crossbows), although at -2 to hit that makes for impossible shots in most cases anyway, so there's yet another reason to not worry about the missing extreme range.
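Here is the scale arithmetic and the two-band rule as a short Python sketch. The function names are invented for the example, and treating a shot at exactly half range as unmodified is an assumption, since the text only says "under" and "over".

```python
FEET_PER_INCH_ADND = 30   # classic outdoor scale: 1" = 10 yards
FEET_PER_INCH_BOW = 20    # Book of War scale: 1" = 20 feet


def real_range_feet(inches, feet_per_inch):
    """Distance on the ground represented by a tabletop range in inches."""
    return inches * feet_per_inch


def range_modifier(distance_in, max_range_in):
    """Basic-rule to-hit modifier: 0 up to half range, -1 beyond, None if too far."""
    if distance_in > max_range_in:
        return None
    return 0 if distance_in <= max_range_in / 2 else -1


if __name__ == "__main__":
    # Light crossbow, 18" maximum range on the table.
    print(real_range_feet(18, FEET_PER_INCH_ADND), real_range_feet(18, FEET_PER_INCH_BOW))
    print([range_modifier(d, 18) for d in (6, 9, 12, 18, 20)])
```

The same 18" on the table covers 540 feet at the AD&D outdoor scale but only 360 feet in BOW.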

And as a final, final detail, note this important item: Targets in heavy armor (plate, AH6 in BOW) are effectively immune to normal D&D missile attacks beyond short range (with -1 to hit, it's like AH7 versus d6). So keep that in mind for your tactical play.
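A tiny sketch of why that works out, assuming an attack hits when the d6 roll plus any modifier meets or exceeds the target's AH number. That reading is inferred from the "AH7 versus d6" remark and is not quoted from the rulebook.

```python
from fractions import Fraction


def hit_chance(ah, modifier=0):
    """Chance that d6 + modifier >= ah (assumed reading of the BOW mechanic)."""
    hits = sum(1 for face in range(1, 7) if face + modifier >= ah)
    return Fraction(hits, 6)


if __name__ == "__main__":
    print(hit_chance(6))        # plate at short range
    print(hit_chance(6, -1))    # plate beyond half range
```

At short range plate is hit on a 6 only; with the -1 modifier no face of the die is high enough.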

Monday, October 24, 2011

Book of War: Pikes

Pikes are particularly nebulous in the D&D man-to-man rules. In OD&D (Sup-I) they are simply listed as being flat-out inappropriate for dungeon work; the effect of their reach has no rule specified until the AD&D DMG; and at no point are any rules given for the effect of mass pikes (except, perhaps, for Chainmail mass-combat, where they simply cannot be attacked by any non-pikes). We discussed that here as part of last spring's "Basic D&D" posts -- below you'll see the outstanding questions I asked at that time (with apparent commentator consensus in bold):


  1. How many ranks of pikemen can strike offensively, say 3? Yes.
  2. How many ranks of pikemen can strike defensively, say 3? Yes.
  3. How many ranks of pikemen can "set" for double damage, say 3? Yes.
  4. Do we allow an attack by pikes to "interrupt" the movement action of an opponent (even by a pikeman not individually the target of the attacker)? Yes.
  5. Can pike "interrupt attack" any number of attackers during a turn (versus some limited number, say, 1 as in 3E)? No.
  6. When used against charging cavalry, can the pikemen all opt to strike against the riders? (Or is it 50/50 riders/horses? Or more likely against the horses?) No (average response 75% to hit rider).
  7. Do pikemen get a "formation bonus" to hit defensively due to closely-packed spikes? No.
  8. Does a strike by a pike vs. an attacker end the attacker's move? Yes.
  9. Does a kill by a pike block other attackers moving through the same zone (either by piling up bodies or "skewering" upright)? No.
  10. Do we need to establish special rules to simulate the organization/formation requirements of properly using pike? Mixed.
  11. Do we permit heterogeneous formations (pike & halberd, pike & crossbow, pike & shot, etc.)? Yes.
  12. If a man drops the pike to use sidearm sword, can he later pick the pike back up? Yes.
  13. Do pikes cancel the cavalry rider AC bonus? Yes.


So clearly we all agree that about 3 ranks of pikes can strike (whether on attack or defense), that all of those can make an interrupting attack against enemy movement (obvious, but even that not explicated in D&D until 3E), and that pikes cancel the rider AC bonus from height (which syncs up with what we said for cavalry, here). Those are Questions #1, 2, 4, and 13 above, which are included in the Book of War rules. Ironically, if you dig into the details of the RPGBattle simulator here, you'll see that almost everything else got baked in exactly opposite to those responses. Ha! Let's see if I can explain why:


Problems With Pikes

The main problem (as discussed last spring) is that core D&D hit chances are really quite low, too low to provide much of a defensive benefit, in the absence of some other to-hit bonus from a massed pike wall (and absent it is from any classic D&D rules). Look at Questions #6-7 above; these are sticking pretty close to published D&D, with no formation bonus like I'm suggesting, and a significant chance to hit the horse instead of rider (note that that decision itself almost winds up negating/replacing the "pikes cancel AC bonus" decision in Question #13).

Consider a line of heavy cavalry charging into our massed pike wall, 3 ranks deep (all normal men). To-hit against AC2 is 17+ (4 pips). If there's a 25% chance to hit the horse (Question #6), then this is reduced to 3 pips in 20, i.e., probability 3/20 = 0.15. Hence the following is the chance for the rider to get through the thicket totally unscathed:

(1-0.15)^3 = 0.85^3 ≈ 0.61 = 61%

So unfortunately, that doesn't serve the historical strategic purpose of pikes definitively holding off a charge from heavy cavalry (and in many sources, being almost totally invulnerable to such attacks; see also the Chainmail mass rules comment). Almost 2/3 of heavy cavalrymen can run through a pike thicket without a scratch! When I simulate this rule in RPGBattle, it turns out that the pikes actually suffer more casualties than the cavalry when they get charged. So I think we have to say that the existing (very sketchy) D&D man-to-man rules for pikes are simply insufficient for this project.
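The same calculation as a small Python function, in case you want to plug in other numbers. The parameter names are made up for the example; the defaults are the figures used above (4 pips in 20, 75% of hits on the rider, 3 ranks).

```python
def unscathed_chance(pips_to_hit=4, rider_share=0.75, ranks=3):
    """Chance a charging rider passes every rank of pikes without being hit."""
    p_hit = pips_to_hit / 20 * rider_share   # per-rank chance to hit the rider
    return (1 - p_hit) ** ranks


if __name__ == "__main__":
    print(round(unscathed_chance(), 3))
    print(round(unscathed_chance(ranks=1), 3))
```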

Of course, AD&D establishes the "set pikes do double damage to charging attackers" rule, but if you think about it, that rule does surprisingly little good for our purpose. The problem is really that the to-hit rate is too low to connect with most of the men charging through the pike wall. You could increase that damage multiplier to ×100 and still the majority of heavy cavalry will move through the pikes without taking any hits. And this irrelevance is compounded by the fact that against normal men, a single hit usually kills them anyway, whether damage is single or double or anything else. (Note: The response to Question #3 -- all ranks can set against a charge -- seems counter to the rule on DMG p. 66, where the pike butt must be set on hard ground surface; to me that seems doable only by the front rank. But as we see here, the difference is of mostly academic interest anyway.)

Also, regarding Question #8 (yes, a defensive pike hit stops the enemy move forward), this is something that seems pretty reasonable if we picture the enemy as only a normal man. But we also need to deal with things like a charging horse, or in our fantasy game, things like lumbering ogres, giants, and dragons (far more likely to have inertia to snap the pike and keep coming?). So I'm very hesitant to establish that as a general rule (and it's not in D&D, and it's the same problem as Chainmail flatly disallowing anything to attack pikes -- okay for normal men, but doesn't scale to fantasy monsters).

And here's another counter-intuitive result that comes out of the simulator: the more you dial up the pike damage multiplier, the more casualties the pikes themselves take. This is because an attacker who would otherwise take a hit and stop coming, yet survive and block the attackers behind him, now turns into a downed attacker, and an opening for other attackers to keep charging (no further pike interrupt per Question #5; and no blockage from dead bodies per Question #9). That's a result that really boggled me when I first saw it.

One more thing: 3E gave a cover penalty to any reach weapons wielded from the back rows (+4 to AC; PHB p. 132), but that would immensely exacerbate these problems. (Like, for us, men in heavy armor would be totally un-hittable by anyone in the back ranks.) Therefore that rule was never on the table, either.


Solutions in Book of War

So this was one of the few places in BOW where I was willing to switch significantly away from by-the-book D&D man-to-man rules (which, quite defensibly, avoided mass pike issues in the first place) and come up with something that made our game play out reasonably well as a priority (which was the core of the crisis that I had last fall). The primary thing I did is to institute a "formation bonus" (Question #7), which is to say, a hefty "you can't dodge this" modifier when you get into a restricted space with a bunch of pike-shafts hemming you in on all sides, which I ended up setting at +4 bonus value for the mass pikes (or, +1 in BOW d6 space). This made it at least possible for pikes to hold off a charging enemy, even if it's still not a sure thing (it's possible in BOW for heavy cavalry to attack pikes, and with some semi-lucky die rolls, still manage to get through). Again, the to-hit level is really more critical than any damage multiplier.
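To get a feel for what a +4 bonus does at the man-to-man level, here is a rough Python sketch of a rider's chance to pass 3 ranks of pikes unscathed. It assumes a base to-hit of 17+ on d20 (4 pips in 20), and it shows the result both with and without 25% of hits going astray onto the horse, since which of those applies alongside the bonus is left open here. None of this is the BOW d6 mechanic itself.

```python
def unscathed_chance(base_pips=4, bonus=0, rider_share=0.75, ranks=3):
    """Chance to pass every pike rank unhit, given a d20 to-hit bonus."""
    pips = min(20, base_pips + bonus)      # faces of the d20 that hit
    p_hit = pips / 20 * rider_share        # share of hits landing on the rider
    return (1 - p_hit) ** ranks


if __name__ == "__main__":
    for bonus in (0, 4):
        print(bonus,
              round(unscathed_chance(bonus=bonus), 3),
              round(unscathed_chance(bonus=bonus, rider_share=1.0), 3))
```

Under these assumptions the bonus cuts the rider's unscathed chance from roughly 61% to roughly 34% (or about 22% if every hit counts against the rider): a big swing, but still not a sure thing.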

Here's another thing I had to consider: We now have a whole litany of if-then situations we could theoretically apply for the pike attacks rule. Such as the following:

  1. One rank, or many ranks of pikes? (If a BOW player sets up one figure-rank of pikes, that's only 2 man-ranks, so they don't get the full force of the pike benefit; two or more figure-ranks, however, is more than the 3 man-ranks we agree can all strike.)
  2. Infantry, or cavalry target? (Per Question #13 here, we all agree that the cavalry AC/target modifier needs to be washed out when the target is attacking cavalry.)
  3. Defensive, or attacking use of pikes? (The pikes could theoretically set for double damage in the first case but not the last; and also there's a difference in commitment to the opponent willing to run themselves through the pikes.)
  4. Rough, versus open terrain? (In our mass combat, we also want to reflect the decreased utility of pikes in irregular, rugged terrain.)

So if we directly handled all of that in BOW, there would be at least 4 if-then clauses to the pikes rule that you'd have to parse every time they were used -- equivalent to a 4-dimensional matrix with 2^4 = 16 different individual use-cases. (And if Question #10 had gone "yes", then that would have added a "5th dimension", which is then not even a game from TSR.) For playability and brevity I found that some of this stuff simply had to get cut out.

Now, IF we had gone directly with the results from the prior blog commentators' consensus (none of my alterations), then we could technically have had effects something like this: Pikes in full ranks against infantry get ×4 attacks in the first turn on defense, ×3 in later turns; against cavalry they get ×3 attacks in the first turn and ×2 in later turns.* Pikes in a single figure-rank get ×2 attacks versus infantry in the first turn on defense, ×1 versus infantry in later turns or against cavalry all of the time.

* Visualize: Most of the time, enemies are taking attacks from one lead guy pike-down with a sword, plus one further-back guy with the pike, for 2 attacks per turn. Add another attack per turn on average when someone moves through the pike field; and more on the first turn when that's definitely necessary. Cavalry dial it down a bit from the 25% of hits that would go against the horse instead. This always assumes a rational strategy of opponents hanging back beyond the pike field until there is a gap for them to advance into (i.e., never just hanging out in the pike field taking hits for no good reason).

No matter how we slice it, by D&D man-to-man rules, within the first turn an enemy will be able to get under the pikes and start delivering hits on the pikemen (the pike defense doesn't make the pike unit appreciably harder to hit in BOW scale). So in the interest of brevity, I've basically taken the extremes of the results above and cut the pike rules down to those: ×4 attacks on the defensive interrupt-attack, ×2 for same in a single figure-rank, or ×1 attack in any other situation. This specifically highlights the defensive mechanic of pikes; the model is, like, after the first turn, the enemy has gotten in "under" the pikes, there is a swordfight at the front rank while back pikes are not really usable, and meanwhile further ranks of the enemy hang back out of range until an opening appears. Pikes get the +4 defensive formation to-hit (+1 in BOW) versus everyone, and an additional +2 to hit (also rounded up to +1 in BOW) versus cavalry to reflect their lost AC bonus. Those modifiers are sufficient to score at least some hits against even charging heavy cavalry, and then by giving a morale check in that situation, there is some good chance to turn aside the attack; and the mechanic deals smoothly with huge monsters like giants and dragons (unlike the Chainmail mass rule that simply prohibits any attacks on pikes).

The final published rule for pikes in Book of War winds up looking like this:
Pikemen: Footmen with long pikes have a special defensive advantage: when an enemy first makes contact from the front, the pikes get an immediate free attack. This attack is at double dice in a single rank, or quadruple dice in multiple ranks; with an attack bonus of +1 vs. infantry, +2 vs. cavalry. The enemy checks morale immediately, and if failed, gets no attack. Pikes lose this benefit in any non-open terrain, or when routed. Pikes can also attack enemies up to 1" away, without making direct contact. [BOW, p. 7]

The additional things you see here are a loss of the defensive benefit on non-open terrain or when routed (to give the flavor of pike organization issues and the historical cases of losing when pushed into bad terrain, without creating any brand-new mechanics just for this purpose; see Question #10). And the 1" attack range at the end is both realistic, and a nice way of signaling whether the "interrupt strike" has occurred yet (in synch with our desire to have all information visible directly from the figures on the table; if pike figures are in direct contact with an enemy, then it's past the first turn and the interrupt strike is over). Tactically what usually happens is that pikes get 1" away and make an initial, single attack (from the front man-ranks only), giving the enemy the opportunity to flee away from that challenge. However, if on the next turn the enemy presses the attack, then they'll suffer the whole ×4 attack routine with bonuses, if so committed to that course.
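For anyone who wants to poke at it, the published rule boils down to a small lookup. Here's a minimal sketch in Java. The class and method names are illustrative only and are not taken from the simulator source. It assumes the dice multiplier and to-hit bonus apply only to the free attack on first frontal contact, with everything else falling back to ×1 and no bonus:

```java
// Hypothetical sketch of the published pike rule; names are illustrative only.
class PikeRule {
    enum Target { INFANTRY, CAVALRY }

    /** True when the pikes still get their free defensive attack. */
    static boolean hasBenefit(boolean firstFrontalContact, boolean openTerrain, boolean routed) {
        return firstFrontalContact && openTerrain && !routed;
    }

    /** Attack-dice multiplier: x4 in multiple figure-ranks, x2 in one rank, x1 otherwise. */
    static int diceMultiplier(int figureRanks, boolean benefit) {
        if (!benefit) return 1;
        return figureRanks >= 2 ? 4 : 2;
    }

    /** To-hit bonus on the d6 scale: +1 vs. infantry, +2 vs. cavalry (assumed 0 without the benefit). */
    static int attackBonus(Target target, boolean benefit) {
        if (!benefit) return 0;
        return target == Target.CAVALRY ? 2 : 1;
    }

    public static void main(String[] args) {
        // Two figure-ranks of pikes in open terrain, first contact by cavalry from the front.
        boolean benefit = hasBenefit(true, true, false);
        System.out.println("dice x" + diceMultiplier(2, benefit)
                + ", bonus +" + attackBonus(Target.CAVALRY, benefit));
    }
}
```

Collapsing terrain, rout status, and first contact into one yes/no flag is what keeps the rule to a single test at the table, instead of the 16-case matrix discussed earlier.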

That's as short, concise, and playable as I could make that rule -- I went around with numerous different formulations last fall and got feedback from my primary testers before settling on it. (I think somewhere I have a text file with about a score of different possible permutations.) I've found that it works out extremely well in play, and it even has some nifty "emergent behavior" that winds up playing in ways different than what you might expect at first blush (although I've given some hints in the paragraph above). More on that a little later.

If you want to see it, below you'll find a version of the RPGBattle program with the blog commentators' consensus options all switched in, and a spreadsheet of results for pike attacks in that scenario:

Friday, October 21, 2011

Friday Night Book of War

As I think I've mentioned previously, my girlfriend and I now have a regular Friday "date night", which involves ordering Chinese food, watching a TV show on Hulu, and then playing Book of War. It's actually our favorite game at this point, which is rather surprising because: (1) I was never a wargamer as a younger person (although the idea was vaguely attractive to me), and (2) she was outright traumatized by her older brother demanding that she play various complex Napoleonic wargames when she was a girl. Here's a game from last Friday that just happens to involve a lot of cavalry of different types:


Start -- We agreed to play by Basic Rules only, at 200 points each (the minimum we play with; game takes about 1 hour). Opposition at the top has selected light infantry, heavy cavalry, and heavy crossbows; at the bottom, I've selected a more lightweight force of light cavalry, light infantry, and horse archers. We play on a smallish table, 3×3 feet, which generates a total of 4 rolls for terrain. Surprisingly, the only terrain that showed up was one section of woods; this probably benefits my army. Opposition has initiative and will move first.



Turn 1 -- In this turn, opposition moved forward, with crossbows taking a half-move to shoot at my horse archers, scoring 1 hit (white die shows damage; another hit will remove a mounted figure). I've charged forward on the left, and managed to rout her light infantry there immediately. I've also sent light cavalry into the woods (an unusual and risky move), while my horse archers on the right shot at her central light infantry, routing those as well. (Note that shooting at heavy cavalry at this distance was a non-option, since I can't penetrate their AH 6 armor beyond short range.) That's a good turn for me.



Turn 2 -- Opposition heavy cavalry now charge my infantry in the center, immediately killing 5 of the figures (50 men). On the morale dice, I need 9+ in order to avoid routing -- which I succeed at (shown in picture)! This is fortunate for me, since it means her heavy cavalry will be hung up for at least one more turn dealing with those infantry.



Turn 3 -- A turn-and-a-half later, and I've first charged my horse archers into melee (versus both crossbows & heavy cavalry), followed by my light cavalry and infantry units from the woods (with light cavalry finding the rear of the enemy for +1 to hit). The crossbows have been totally destroyed, but I've only managed to score 1 hit against the heavy cavalry. In response, my already-damaged light infantry have now been routed from the table (see tag at bottom edge), I've lost a horse archer figure from each unit, and her 2nd unit of heavy cavalry is now in the fight. Opposition calls this action in the center "a big pukey mess".



Turn 5 -- In the prior turn, I managed to maneuver one of my light cavalry units into the center melee, scoring another hit on the heavy cavalry (one figure lost), and forcing a morale check. This was disastrous for the opposition -- she could pass the morale check on any 6+ (2d6), but only rolled a 5! Thus, you see the heavy cavalry unit running away from the action. Meanwhile, I've lost the remaining horse archers and most of my infantry in the center. In subsequent turns, my light cavalry will pursue the routing heavies to the edge of the table, getting rear attacks each time.



Turn 7 -- More crazy good luck for me and French-idiom curses from the opposition. Her remaining heavy cavalry (yellow) is scoring hits on me, but in the picture you see my morale checks for the turn (all 6's!). In particular, those two infantry figures had been routed, but now they'll get to turn around and re-join the fray.



Turn 8 -- I'm trying to beat down those remaining heavy cavalry, but it's never easy; again, only die-rolls of 6 score hits (except for rear attacks). Here I manage to eliminate one figure, although her morale is still good. On the next turn she'll rout the near unit of light cavalry (green) off the table, but I have another unit circling around the woods after finishing off the heavy cavalry that routed earlier.



Turn 10 -- The End. Here I needed one more hit to finish her off, and although my attack roll was not great (pictured; almost all 1's and 2's), there is a single 6 in there that does the job. Victory! This was actually a more lopsided game than most, definitely owing to several unlikely morale checks that all went in my favor (my light infantry saving, and her heavy cavalry routing); plus, the terrain was to my advantage, and I didn't make any egregious mistakes. I wouldn't expect to beat heavy cavalry quite so easily a second time, but I'll take my wins where I can get them.
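
To put a rough number on "unlikely", here is a small sketch that counts 2d6 outcomes for the two morale checks called out above: my infantry needing 9+ in Turn 2, and her heavy cavalry failing a 6+ in Turn 5. It assumes plain, unmodified 2d6 rolls that are independent of each other, and the class name is just illustrative:

```java
// Illustrative only: odds of the two key morale rolls, assuming unmodified 2d6.
class MoraleOdds {
    /** Probability that the sum of two six-sided dice is at least the target. */
    static double atLeast(int target) {
        int ways = 0;
        for (int a = 1; a <= 6; a++) {
            for (int b = 1; b <= 6; b++) {
                if (a + b >= target) ways++;
            }
        }
        return ways / 36.0;
    }

    public static void main(String[] args) {
        double infantryHolds = atLeast(9);        // Turn 2: pass on 9+
        double cavalryRouts = 1.0 - atLeast(6);   // Turn 5: fail a 6+ check
        System.out.printf("Infantry holds: %.3f%n", infantryHolds);
        System.out.printf("Cavalry routs:  %.3f%n", cavalryRouts);
        System.out.printf("Both happen:    %.3f%n", infantryHolds * cavalryRouts);
    }
}
```

Each of those rolls comes up 10 times in 36, so getting both in the same game is something like a 1-in-13 event.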

Wednesday, October 19, 2011

Book of War: Cavalry

Once you get past the Book of War Core Rules, then there's a "Basic Rules" section which covers historical types of normal men, terrain setup, basic formation and movement issues, and a simple morale mechanic new to the game. To some extent this is an homage to how Chainmail was set up, but more importantly it's just a really good way to introduce the game in a way that's both short and coherent (both thematically and mechanically).

Let's talk about how cavalry work. Last winter I asked some questions here about the preferred adjudications for some of the parts of D&D which are ambiguous in this respect. Here are those questions again, and the apparent consensus that I could see from the comments section (in bold):


  1. Should the modifier for mounted-vs-foot be doubled (+1 to +2) if we use it in the context of D&D? Mixed.
  2. Can any of the following ignore the rider AC bonus: (a) footmen with polearms, (b) archers, (c) giants? Yes.
  3. If a footman's attack misses because of the mounted modifier, does it hit the horse? No.
  4. Can men opt to intentionally attack the horse instead of the rider (and is there any symmetric modifier or chance to hit the rider)? Yes.
  5. Can unhorsing be accomplished with any weapon type in OD&D? Yes.
  6. Should there be some radical change to how charge attack to-hits are adjudicated (i.e., no lance exceptionalism, use horse attack level, speed indicator, re: to-hit and damage)? Yes.
  7. How many attacks per round should horses be given -- just one? Yes.
  8. Should horses continue to be barred from any attack in the charge round? No.
  9. Should there be an "overrun" capacity in which cavalry can move/attack/move (and possibly more) within a single charge round? Yes.
  10. What level of AC should barded horses be given? Mixed.
  11. Should riders be positioned at the front or rear of the horse (i.e., can they sword-attack an enemy in front of the horse)? Varies.
  12. Do we use the "rider stun" chart from Chainmail? What if the horse is dropped in a non-intentional-unseating attack? Yes.
  13. Do warhorses attack on their own if the rider is killed or unhorsed? Do they run from the line of battle, or stand motionless? Varies (suggest morale checks).
  14. Should we use a +4 to-hit bonus for charging cavalry (doubled from Chainmail's cavalry-first-turn-bonus, p. 25), and a separate +2 bonus for anyone else charging (as per AD&D DMG p. 66)? Varies.


Fortunately, I also agreed with all of those definite answers above -- it was already being baked into the game that way, so it was nice to have additional confirmation that those were reasonable rulings. Here are some more observations:


On Cavalry Attacks

First of all, it's a complete non-starter for me to give mounted horses in combat something like 3 attacks per round. The problems with that are legion: (1) it's complicated and fiddly to roll all those dice for fairly small damage amounts, (2) it really seems unrealistic that a horse can strike with all those weapons all at the same time while the rider also strikes *, (3) it actually makes the cavalry charge less damaging than sustained combat, really missing the whole point to cavalry on the battlefield, etc., etc. For reasons like these, I'm deliriously happy to stick to OD&D with its one-attack-per-round system (and reiterated for mounts in Swords & Spells p. 18), avoiding the whole Greyhawk/AD&D attacks system in this regard.

(* Growing up on a farm, I've been knocked on my ass by horses & oxen several times. But every time it was from a single hoof shooting out in a quick, solid blow. Getting kicked twice & bit all at the same time just couldn't happen.)

So what occurs in BOW is that a rider/mount unit gets 2 attacks per turn (1d6 damage each), effectively one from the rider and one from the horse. For the horse, maybe that's one kick, or an overrun/trample type attack. You also get this same "overrun" attack from the horse even in the first round/turn of combat, which simplifies things, and basically splits the difference between the more-significant charge attack (historical?) and the more-significant sustained combat (AD&D). However, in any bad terrain (woods, hill, swamp, etc.) we assume that the horse doesn't have its footing to accomplish this, and cavalry attacks are thus halved to just 1 per turn for the unit. (See Questions #6-9 above.)


On Cavalry Defense

The defense capabilities of cavalry are a little more complicated to analyze. In short, what BOW does is give every normal man/horse combination 2 hit dice. More generally, most riders on horses double their hit dice; or for mounts that are naturally aggressive -- like wolves or dragons -- you'd add the rider & mount's hit dice. Below we'll consider some issues that led to this. (For the complications in D&D regarding cavalry defense, see survey questions #1-5, 10, 12, and 13 above.)

For Question #1 (appropriate rider AC bonus; Chainmail gives +1 on p. 26), we've said previously that Chainmail bonuses should be doubled for D&D play. Indeed, a D&D +1 bonus would be too small to make any difference at the BOW d6 scale (we usually divide modifiers by 3 and round down), although I'm willing to round up from +2 in certain special cases. I ran simulations in RPGBattle of possibly using a +2, +4, or +6 rider AC bonus; one problem that came out of that is the heavy cavalry types (in plate & shield, AC2) would become totally invulnerable to normal men at anything over +2 (like, +4 bonus = AC -2), so I decided to assume a +2 AC rider bonus.
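
The d20-to-d6 conversion mentioned above (divide by 3 and round down, with an occasional round-up as a special case) can be sketched as follows. The helper name and the boolean switch are assumptions for illustration, not anything from RPGBattle:

```java
// Illustrative sketch: convert a D&D d20 modifier to the BOW d6 scale.
class BowModifier {
    /**
     * Divide the d20 modifier by 3. Rounds down by default; pass roundUp = true
     * for the special cases where a small bonus is rounded up instead.
     */
    static int toD6(int d20Modifier, boolean roundUp) {
        if (roundUp) {
            return Math.floorDiv(d20Modifier + 2, 3);   // ceiling of modifier / 3
        }
        return Math.floorDiv(d20Modifier, 3);           // floor of modifier / 3
    }

    public static void main(String[] args) {
        System.out.println("+1 -> " + toD6(1, false));
        System.out.println("+2 -> " + toD6(2, false) + " (or " + toD6(2, true) + " if rounded up)");
        System.out.println("+4 -> " + toD6(4, false));
    }
}
```

This shows why a plain +1 vanishes at BOW scale, and why +2 only matters in the cases where it gets rounded up.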

But what does that indicate, really? For standard melee types, it's a +2 AC bonus for reach/cover from the horse itself, while trying to strike the rider. (Note, however, that we do not assume resulting misses effectively strike the horse -- Question #3 came out "no" -- and similarly, the original Chainmail rule doesn't have any comment or method for handling hits on horses.) However, for missile attacks falling from above, we don't assume that cover applies, but instead that we're rolling a randomized 50/50% chance to target either the rider or the horse (similar to missile discharge in DMG p. 63). Fortunately, either mechanic is approximately the same: Say, for the heavy cavalryman (AC2), normal men have 4 pips to hit on d20 (to-hit 17+), so reducing that by 2 points is the same as a 50% hit reduction. For other armors it's skewed differently, but still approximately correct, so we're comfortable assuming this for either type of attack.
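
Here's a quick way to check that arithmetic for other armor classes. This sketch assumes a normal man needs (19 - AC) or better on d20, which is extrapolated from the single "AC2 needs 17+" figure above, and it ignores any natural-20 rule. The armor classes in main are illustrative inputs:

```java
// Illustrative sketch: how much of the hit chance survives a rider AC bonus.
class RiderCover {
    /** Number of d20 faces (0 to 20) on which a normal man hits, given a rider AC bonus. */
    static int facesThatHit(int armorClass, int riderBonus) {
        int effectiveAc = armorClass - riderBonus;   // lower AC is better
        int target = 19 - effectiveAc;               // ASSUMED to-hit formula
        return Math.max(0, Math.min(20, 21 - target));
    }

    /** Fraction of hits retained after the bonus; a pure 50/50 targeting model gives 0.5. */
    static double hitsRetained(int armorClass, int riderBonus) {
        return facesThatHit(armorClass, riderBonus) / (double) facesThatHit(armorClass, 0);
    }

    public static void main(String[] args) {
        int[] armorClasses = {7, 5, 2};   // illustrative armor classes only
        for (int ac : armorClasses) {
            System.out.printf("AC %d: %d faces -> %d faces with +2 (retained %.2f)%n",
                    ac, facesThatHit(ac, 0), facesThatHit(ac, 2), hitsRetained(ac, 2));
        }
        // A +4 bonus on AC 2 leaves no hitting faces under this formula.
        System.out.println("AC 2 with +4: " + facesThatHit(2, 4) + " faces");
    }
}
```

Under that assumed formula, the AC2 case comes out to exactly half, and a +4 bonus on AC2 leaves no hitting faces at all, which is the invulnerability problem noted earlier.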

One alternative we considered is to use the +2 AC bonus directly (rounded up, so +1 AH in BOW), but the problem with that is that it makes heavy cavalry AH 7 (i.e., D&D to-hit 19, about one-half chance in d6). If we did that, then we'd need a mechanic that says something like "AH 7 means a hit roll of 6, followed by a 2nd confirm roll of 4-6", which is something they actually do in Warhammer. But (a) my playtesters didn't like that, (b) the extra-die rolls seem complicated, clunky, and inelegant, and (c) the mechanic doesn't scale smoothly to ultra-high ACs you'll have to deal with for high-level heroes. So that was rejected in favor of the 50/50 hit assumption, i.e., cavalry act as though they have 2 Hit Dice, which is statistically about the same anyway.

On the other side of the equation, you have to ask: What's the best strategy for people attacking cavalry; should they target attacks first at the rider, or the horse? One assumption we make is that if the rider goes down, then the horse either stands nearby quietly or runs off (i.e., doesn't actually press the assault while riderless). So the question is really what eliminates the rider faster; perhaps killing the horse first and removing the +2 AC bonus (or whatever) is the better strategy? So that's a strategy question I ran through RPGBattle -- and as might be expected with a 2HD or 3HD horse in D&D, it turns out not to be the case. See the spreadsheet below for results of that analysis; even if we engage both the Chainmail rider-throw table (stun thrown riders for 0-3 rounds; p. 26) and AD&D stunned-to-hit bonus (+4), it's still always better to attack the rider directly than to kill the horse first. ** (And therefore, those latter mechanics [available by switches in the RPGBattle program], and any option to attack the horse [Question #4 above], turn out to be of academic interest only.) The only time when that wouldn't be the case is if we dialed up the rider AC bonus so high (+4 or +6) that the rider is actually immune to attacks entirely. Fortunately, that's also in synch with Chainmail, which as we said before has no comment on any option other than attacking the rider in a cavalry-melee situation (p. 26), bolstering our model here as true to the core D&D system.


(** Now, my dad, who's a large-animal veterinarian, will watch a Western chase scene with me and claim that "it would be much easier to stop that guy by shooting at the horse's leg than the rider", but I'm not so sure about that, since it seems like a really small target. But I wanted to consider it for D&D nonetheless.)

So that's where our BOW model of cavalry hit dice comes from; per D&D, it's really about twice as hard to hit the rider, and therefore effectively takes twice as many good hits to eliminate the unit. Two other details: (1) Barding, following Chainmail, will be assumed to be of only one type that gives +2 AC to the horse, in use for heavy cavalry (see Question #10); therefore horses are all within 2 points of the rider AC, and we say that's "approximately equal" for our purposes at mass scale (and even if doubled to +4, it would be an irrelevant change anyway). (2) Pikes will be given a D&D +2 to hit cavalry (+1 in BOW) to reflect the fact that their long weapons negate the reach/cover bonus that riders normally get from melee-types (see Question #2).

Compare what we've done here to other D&D-branded wargames: Gygax's Swords & Spells directs you to add up all of the hit points of riders & mounts [p. 17], and to defeat them, you've got to degrade all of those summed hit points; personally, I think that's a grave mistake, as you clearly don't need to murder every single limb and cell of a cavalry unit in order to defeat it (again, if you knock off all the riders, the horses -- the majority of the summed hit points, after all -- aren't going to keep leading the attack against you). From my youthful play experience, cavalry in Swords & Spells took enormous, crazy amounts of damage before you could defeat them. In contrast, Doug Niles' Battlesystem instead gives a rule of "average the rider's HD and the mount's HD; round up" [1st Edition p. 19; 2nd Edition p. 105]. That's a distinct improvement, but I think it still runs into problems if you have a very large, non-aggressive mount (like an elephant, perhaps) and a normal man riding it; really, all you have to do is kill the 1 HD man to defeat the unit, and it seems like an overinflated bonus to give him half the mount's HD as a modifier. So that's why I prefer the rule in my Book of War: double the rider's HD for non-aggressive mounts (bonus capped by the horse's HD; reflecting a +2AC or 50% targeting miss chance), or sum the rider & mount HD only in cases of naturally aggressive, monstrous-type mounts. I'm hoping that you'll agree.
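
The two HD rules compared above can be set side by side in a few lines. This is only a sketch of the rules as described in this post; the method names are illustrative, and the 10 HD figure used for the elephant is a placeholder assumption rather than a value from any rulebook:

```java
// Illustrative comparison of mounted-unit hit dice rules as described in the text.
class CavalryHitDice {
    /**
     * Book of War: double the rider's HD for non-aggressive mounts (the bonus is
     * capped by the mount's HD); sum rider and mount HD for aggressive mounts.
     */
    static int bookOfWar(int riderHd, int mountHd, boolean aggressiveMount) {
        if (aggressiveMount) return riderHd + mountHd;
        return riderHd + Math.min(riderHd, mountHd);
    }

    /** Battlesystem as quoted: average the rider's and mount's HD, rounding up. */
    static int battlesystem(int riderHd, int mountHd) {
        return (riderHd + mountHd + 1) / 2;
    }

    public static void main(String[] args) {
        int elephantHd = 10;   // ASSUMED placeholder value for a very large mount
        System.out.println("Man on 2 HD horse, BOW:     " + bookOfWar(1, 2, false));
        System.out.println("Man on elephant, BOW:       " + bookOfWar(1, elephantHd, false));
        System.out.println("Man on elephant, Battlesys: " + battlesystem(1, elephantHd));
    }
}
```

With any large placeholder HD for the elephant, the averaging rule grows with the mount while the doubling rule stays pinned to the rider, which is the difference being argued for.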

Tuesday, October 18, 2011

Book of War Review: La Marca del Este

The first review for Book of War that I can confirm went up late last week at the excellent Spanish-language RPG site, Aventuras en la Marca del Este. Very kind things to say (via Google translate):
[A] great little extra... extremely interesting indeed... elegant... In short, a great contribution, fantastic, compatible with Adventures in the East Mark and highly recommended.
One of the very satisfying observations here is how compatible the work is across a lot of different D&D-inspired game systems, including their own system over there at Aventuras en la Marca del Este. The more, the merrier!

Monday, October 17, 2011

Book of War Rules Justification Part 3

Previously I've shown two different ways of confirming the core Book of War combat mechanic (one, two) -- let's try a third and final way, just to be sure.

This actually came about approximately one year ago. I'd been tooling on the game for some time (years), and last fall I had something of a panic attack/realization that the special types of cavalry, pikes, and archers might not be functioning as I expected in the underlying game. So I took a few months and dug into the RPG-level details of each, and resolved to write a larger simulator program that actually throws large units of D&D guys at each other, tracking the details of each. Here it is (Java in a ZIP file; released under GPL v.2):


The prior program (link "one" above) was a short little thing that, for brevity, only looked at the front file of 5 guys on one side and filled in gaps from "somewhere" as needed (also, it only handled men wielding a normal hand weapon). This new one's a bit more extensive: 14 files and about 1,700 lines. It will track all the individual men involved, move them forward as needed, separately handle mount & rider stats/attacks, let you swap in different AI to compare best strategies, allow men to switch weapons during the fight (needed for cavalry, pikes, archers), etc. By default, it runs 1,000 separate battles and outputs total individual men killed per round, which we can then load into a spreadsheet for mass-level analysis. (There's no GUI, so as usual, you need a Java development environment if you want to tweak or double-check my work).

For example, here's what you get if you just run big units of swordsmen at each other; once again, we see the 4/5/6 target on a d6 for armor types of leather/chain/plate (.xls spreadsheet here):


My point here isn't to be needlessly repetitive. The important thing is that we can use this as a platform to test out the more exotic troop types (pikes, cavalry, archers) and the various interpretations that are necessary for those parts of classic D&D combat that are left ambiguous or undefined (and: to explicate my assumptions in this regard). For example: The effect of mass pikes on closing & sustained combat; the best strategy for attacking a mount/rider, or approaching a unit of pikes; possibly throwing & stunning riders from a downed mount (as in Chainmail); etc., etc. This ties back to some of the "Basic D&D" questions I was posting here last February/March; more on that next time.

Sunday, October 16, 2011

On Talking Concretely About Games

Which is to say: without relying on that hollow word, "fun". From a great anti-"gamification" article by Ian Bogost at Gamasutra earlier this year, a passage that I found particularly useful:
... key game mechanics are the operational parts of games that produce an experience of interest, enlightenment, terror, fascination, hope, or any number of other sensations.
(The fun-word-fails-us manifesto, here.)

Friday, October 14, 2011

Book of War Figure Zoom-In

When you see a Book of War figure that looks like this:


Sometimes it helps to keep in mind that it really represents this:


That is: 10 individuals at man-to-man scale, arranged in 2 ranks of 5 files each. Grid spaces in the picture above are 3 feet each (or 3⅓ feet per DMG p. 10, if you prefer). Total length on one side of the formation is 5 spaces × 3 feet/space = 15 feet; i.e., the same as one figure in BOW scale (top picture), ¾ inch base × 20 feet/inch scale = 15 feet.

Wednesday, October 12, 2011

Early Scaling

As I was forced to dig deep into the Moldvay Basic D&D (1981) rules for the prior blog post, I was surprised to find this at the very end of the book (in Moldvay's 2-page "Dungeon Mastering as a Fine Art" section; literally the last thing in the book before the Credits):
PLAYING SURFACE: Combats are easy to keep track of when large sheets of graph paper, covered with plexiglass or transparent adhesive plastic (contact paper), are used to put the figures on. The best sheets for this use have 1" squares, and the scale of 1" = 5' should be used when moving the figures. [p. B61]
Is this the earliest use in any official D&D rules of the 1" = 5' scale? Both Holmes (earlier; p. 9) and even Mentzer (later; Player's Manual p. 57) still suggest using the 1" = 10 feet scale for miniature play.

I must say (having played by these rules quite a bit back in the day, but not having looked at them in many years) that I'm quite impressed by Moldvay's grasp of the math behind the game. If only he hadn't instituted race-as-class...

Monday, October 10, 2011

On Expected Treasure and XP

Tom Moldvay wrote in his D&D Basic Rulebook (1981):
... most of the experience the characters will get will be from treasure (usually 3/4 or more) [p. B45] *
I think that many old-school players take this as a statement of original design intent regarding the old D&D treasure types and experience system. I'm going to disagree with that, and claim: (a) this statement by Moldvay is descriptive and not normative, and (b) while it's accurate for Moldvay's B/X rules, it does not match other editions of D&D. (Note: All discussion below is in terms of by-the-book D&D gold-standard economy.)


Arneson's OD&D

Let's look at OD&D first. Below you'll see a table of all the hostile monster types (those appearing in dungeons) from Vol-2, p. 3. Each has its standard number appearing, experience point value (per Sup-I, Greyhawk), and expected value from its treasure type (including the requirement that the in-Lair % chance be rolled; as stated on Vol-2, p. 23). Then a ratio for expected value of the treasure versus monster XP is made. (Download full .xls spreadsheet here.)


The end result: Over all of these monster types, there is a GP:XP ratio of 1.5; that is, only about 3:2 in favor of the treasure XP. A clear majority of monsters actually give more XP from the monster than the treasure (about 20 of 30). Note that there are two extraordinary outliers: Dragons (ratio 8:1) and Medusae (ratio 23:1!); if you remove these two outliers from the list, then the overall ratio dips to just 0.8 (i.e., actually less treasure than monster XP). Another way of looking at this, perhaps -- roughly 40% of all the available treasure in the game comes from Dragons, and until the PCs are high enough level to be hunting dragons, their XP will mostly not be coming from treasure. (If played purely according to these random charts.)

Side observation -- The majority of treasure value comes solely from the Jewelry component. By my calculations, almost all of the OD&D treasure types have between 55% and 85% of their average value coming from Jewelry (average 70%; with outlier Type G, a low 20% of its value from jewelry). Or in other words: If you miss the Jewelry component roll for a treasure type, then you've missed about 2/3 of the nominal value of that treasure type, on average. Or again: Making the Jewelry roll approximately triples the total value from any treasure type.

Other note -- You might look at the XP example of the troll in Vol-1 ("7,000 G.P. + 700 for killing the troll = 7,700" [p. 18]) and say, "hey, that's evidence that OD&D gives about 10% XP for monsters". Except that the example is doubly impossible from the listed monster/treasure tables: (a) troll number appearing is 2-12 (1 being impossible), and (b) troll treasure type D has at best 1-6 thousand gold pieces (7,000 being impossible). According to my stats, the average result would be to get 7 trolls for 4,550 XP (7×650 per Greyhawk) and a total 3,743 gp value, i.e., as we're saying, expect more XP for the trolls than the treasure. (Also: This example refers to trolls as being "7th level", whereas the monster levels in Vol-3 only go up to 6th, so the example is pretty disconnected from the rest of the rules.) Keep in mind that if we used pre-Greyhawk XP (HD×100), then things would be even more skewed in favor of the monster XP.
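
The troll figures above can be re-derived in a few lines. This sketch takes the 650 XP per troll and the 3,743 gp expected treasure directly from the text, and assumes the 2-12 number appearing is rolled as 2d6; the names are illustrative:

```java
// Illustrative re-derivation of the troll example using the figures quoted in the text.
class TrollExample {
    /** Average total of rolling the given number of dice with the given number of sides. */
    static double averageRoll(int dice, int sides) {
        return dice * (sides + 1) / 2.0;
    }

    /** Ratio of expected treasure value (gp) to expected monster XP. */
    static double gpToXpRatio(double treasureGp, double monsterXp) {
        return treasureGp / monsterXp;
    }

    public static void main(String[] args) {
        double trolls = averageRoll(2, 6);     // 2-12 appearing, ASSUMED to be 2d6
        double monsterXp = trolls * 650;       // 650 XP per troll (Greyhawk, per the text)
        double treasureGp = 3743;              // expected Type D value quoted in the text
        System.out.printf("Average trolls: %.0f, monster XP: %.0f%n", trolls, monsterXp);
        System.out.printf("GP:XP ratio: %.2f%n", gpToXpRatio(treasureGp, monsterXp));
    }
}
```

The ratio comes out below 1, i.e., more XP from the trolls than from their treasure.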


Moldvay's B/X

Let's try this again for Moldvay's B/X rules. Now, two huge changes will occur at this point. One: The large-scale numbers appearing have been dramatically reduced for the numerous humanoid types (usually dividing by about ten; e.g., men/bandits from 30-300 in OD&D to 3-30 in B/X, etc.). Two: The Lair % statistic has been entirely removed, so presumably any time the larger number of creatures appears, they get their full treasure type valuation. See the results of that below (or spreadsheet for this here):


Now: Over the same core hostile monster types, the GP:XP ratio is close to 3; i.e., a 3:1 relationship of treasure to monster XP -- or in other words, exactly the "usually 3/4" from treasure as Moldvay asserted (see quote at top). Most monster types (about 25 of 30) do indeed give more XP from their treasure than from the monster. While dragons and medusae still have excellent GP:XP ratios, now the far and away outlier is actually Men, with an astounding 108:1 ratio in favor of their treasure! (Analysis: Type A is an excellent type of treasure; the number appearing was divided by 10 from OD&D; and the frequency of treasure was multiplied 6-fold by dropping the low 15% Lair chance.)
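
The step from a GP:XP ratio to a "fraction of XP from treasure" is simple arithmetic: a ratio r means treasure supplies r/(1+r) of the total. A short sketch, assuming XP comes only from treasure (1 XP per gp) and from monsters, with an illustrative helper name:

```java
// Illustrative sketch: convert a GP:XP ratio into the share of total XP from treasure.
class TreasureShare {
    /** Share of total XP that comes from treasure, given the treasure-to-monster ratio. */
    static double fromRatio(double gpToXpRatio) {
        return gpToXpRatio / (1.0 + gpToXpRatio);
    }

    public static void main(String[] args) {
        System.out.printf("B/X, ratio 3.0:               %.2f%n", fromRatio(3.0));
        System.out.printf("OD&D, ratio 1.5:              %.2f%n", fromRatio(1.5));
        System.out.printf("OD&D sans outliers, ratio 0.8: %.2f%n", fromRatio(0.8));
    }
}
```

So a ratio of 3 is exactly the three-quarters Moldvay describes, while the OD&D ratios of 1.5 and 0.8 work out to roughly 60% and 44% of XP from treasure.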

Side observation -- Moldvay presents a list of "average values (in gold pieces) of each treasure type" [p. B45], and Moldvay's averages are extremely accurate. (They match very nicely to my numbers in the linked spreadsheet.)

Other notes -- In general, the following are all copied directly from OD&D: (a) Monster treasure types. (b) Treasure type contents (with the addition of new electrum & platinum categories). (c) Monster numbers appearing, for types other than the multitudinous humanoids. (d) The XP values for monsters. (e) Dungeon unguarded treasure tables. However, gem and jewelry values have distinctly dropped by abandoning parts of the generation procedure (gems in a batch "increasing value", and jewelry high-end exceptional rolls).


Treasure in the Dungeon

Now, the preceding was based just on looking at the core "numbers appearing" and "treasure types" from OD&D, which are generally supposed to be just for wilderness encounters. We might ask the obvious question of what's supposed to be the case in the dungeon, but the situation there is enormously more murky. Unfortunately, all of the classic versions of D&D leave this issue almost entirely unspecified and in the realm of pure DM fiat.
  • OD&D states multiple times that monster numbers should be scaled to size of the PC party (Vol-2, p. 4; and Vol-3, p. 11). The listed numbers should be "primarily only for out-door encounters" (Vol-2, p. 4); and treasure types are only applicable to "those cases where the encounter takes place in the 'Lair'" (Vol-2, p. 23). In the dungeon, all we are given is that creature numbers are "modified by type" (Vol-3, p. 11; more discussion here). Dungeon treasure might possibly be generated on level-based tables (Vol-3, p. 7), although in later editions of D&D, those tables are generally indicated as for unguarded treasure only. (Note: If used for that purpose, then the average GP:XP is even lower, 0.7 by my calculations, i.e., about the same as the treasure types sans dragons: spreadsheet here.)

  • Holmes D&D keeps the same treasure types; he removes all of the numbers appearing in the monster entries (esp., all of the hundreds of humanoids); but he adds specific numbers for the dungeon wandering monster tables (usually on the order of 1d6 or so). But as far as dungeon treasure goes, he gives a short nod to OD&D and then punts to another product entirely:
    The TREASURE TYPES TABLE (shown hereafter) is recommended for use only when there are exceptionally large numbers of low level monsters guarding them, or if the monsters are of exceptional strength (such as dragons). A good guide to the amount of treasure any given monster should be guarding is given in the MONSTER & TREASURE ASSORTMENTS (available from TSR or your retailer). [Holmes D&D, p. 22]

  • Moldvay's B/X still maintains the same treasure types; and he merges the numbers appearing into two high/low categories (but again: dividing the truly large numbers by about 10). He says that the higher numbers are for when "met in the monster's lair (home) or in the wilderness", and regarding treasure types, "in general, treasure is usually found in a monster's lair (home)" [p. B30]. This linkage is reiterated again later:
    Treasures A through O are large, and generally only for use when large numbers or fairly difficult monsters are encountered. The lairs of most human-like monsters contain at least the number of creatures given as the wilderness "No. Appearing" (the number in parentheses). [p. B45]

  • However, having said that twice so far (that full Treasure Types are for large, lair-wilderness numbers only), Moldvay then contradicts this with his dungeon stocking procedure. Having rolled a small, random dungeon-wandering sized encounter, he says, "If treasure is in a room with a monster, use the Treasure Type for that monster (given in the monster description) to find the treasure in the room.)" [p. B52] Zounds!

  • Frank Mentzer's DM's Rulebook basically copies the Moldvay language on treasure types. "When the Treasure Type is a letter from A to O, that should only be the treasure found in a full lair (the Wilderness No. Appearing -- the number in parentheses in the monster description)" [p. 40]. However, his dungeon-stocking procedure apparently switches back to the OD&D rule -- it deletes any mention of monster Treasure Types, and instead references the same short level-based random treasure table: "The amount of treasure can be determined by using the random Treasures Table..." [p. 47] (I guess I would consider this a proper fix to the overly-generous and contradictory Moldvay rule.)

So we see that in most versions of D&D, the preponderance of the evidence is that Treasure Types are actually not to be used for standard dungeon-based small numbers of monsters, but only for large wilderness-equivalent numbers in the "lair". Which is a rather significant misstep, based on our standard dungeon-centric use case. But that data is the best we have for expected XP ratio from treasure/monsters -- and as we've seen for OD&D, if we use the dungeon level-based treasure tables, then the ratio is even lower (more from monsters than treasure). In neither case does it seem like this was an advance design consideration.

(Notice that I haven't worked out AD&D numbers for this discussion: it would be quite a bit harder, since in that work Gygax switched from one-letter-type-per-monster to a mixture of several different combined letters per monster. That said, I'm assuming that the ratios are about the same as in OD&D, since the numbers appearing, in-lair %, etc., are generally copied directly from that work. With the possibly large wild card of awarding XP for usage of magic items.)


Conclusions

Here are some conclusions that I would offer, based on this evidence:
  • Arneson probably didn't plan out any statistics like this in advance for the original system. And probably Gygax never actually used random treasure tables at all in his games. (I'd say they're both notorious for not actually using the published rules; and the vagueness of dungeon numbers and treasure speaks to the lack of any specific system for that in the first place.)

  • Moldvay, however, shows an exquisite awareness of the average results produced by the treasure table system (as evidenced by his correct 3/4 ratio statement; and listing the correct average values for each treasure type, unique to his rules). That said, this could not have been an advance design decision, because he simply copied all the legacy types and valuations from OD&D (while deleting Lair % across the board, and reducing the larger numbers appearing).

Let's accept that the D&D treasure and experience amounts were not initially designed with any particular ratio of XP from treasure versus monsters. But let's say that you want that, to promote certain desirable types of gameplay (such as rewarding treasure-acquisition from stealth and trickery, for example). Then you might select from one of the following possible options: (a) Follow Moldvay in deleting in-Lair % checks, and dividing humanoid lair numbers by about 10. (b) Ignore the in-lair dictums for treasure types entirely, and award the whole Treasure Type even for small numbers like 1-6 orcs. (c) Boost the XP value from treasure, perhaps awarding 10 XP per GP, or something like that (also accelerating advancement). (d) Shift all of the XP away from monster-killing, adding the same value to their treasure-acquisition awards (if that's what you want to promote, might as well go whole-hog, eh?)


* Thanks to Tavis and Kipper at the ODD boards for reminding me where this statement came from.

And additional thanks to UWS Guy and DHBoggs for informing me in the comments that the OD&D treasure type system was the work of Dave Arneson.

(Photo by Falashad, under CC2.)

Friday, October 7, 2011

Testing Balanced Dice Power

The point of this not too far-fetched scenario is that chi-square is a test of rather low power; its ability to reject the null hypothesis, even when the null hypothesis is patently false, is quite weak. And the smaller the size of the sample, the weaker it is. -- Richard Lowry, Vassar College
One of the things that's gotten a lot of interest on this blog is my presentation of how to test for fair (balanced) dice -- a statistical application of the well-known Pearson's chi-square test. (See prior posts on the subject here, here, and here.) Among the things I said about the test, early in the first post, was this:
It has a significance level of 5%; that is, there's a 5% chance for a die that's actually perfectly balanced to fail this test (Type I error). There's also some chance for a crooked die to accidentally pass the test, but that probability is a sliding function of how crooked the die is (Type II error). A graph could be shown for that possibility, but I've omitted it here (usually referred to as the "power curve" for the test).
And when I said, "a graph could be shown for that possibility, but I've omitted it here", that was, of course, code-speak for "I have no f*ing idea how to compute that or what it would look like". At least one person later expressed interest in seeing it, so at that point my goose was cooked, so to speak. (Thank you very much, Mr. JohnF.)

Therefore, what I did recently was sit down and write a short Java program to simulate the appropriate power-test results by random simulation, and I'll present them below. This investigation was quite instructional to me personally, because it was a significant step outside my comfort zone, and not something that I could find explicitly done anywhere online or in any textbook I could access.

Let me first explain some testing terminology, so that we can be careful with it. In statistical hypothesis testing, there is defined a "null hypothesis" (nothing is changed from normal), and a competing "alternative hypothesis" (something is changed from normal). Usually we, the experimenter, are in some way rooting for the alternative hypothesis (as in: this drug makes sick people recover faster, so now we can build a manufacturing plant and start selling it). To be safe, hypothesis tests are therefore set up with a very high burden of proof for the alternative hypothesis. The end result is technically one of either "reject the null hypothesis" or "do not reject the null hypothesis" -- and without extraordinary evidence to the contrary, we "do not reject the null hypothesis" (i.e., assume nothing has changed by default: compare to other notions like burden of proof and Occam's razor).

Mathematically for us, the null hypothesis will be a specific fixed number (probability distribution), and the alternative hypothesis will be that something varies from that expected number. For dice-testing, therefore, the null hypothesis is actually that the die is perfectly balanced (no face different than the others; e.g., 1/6 chance each for a d6). The alternative hypothesis is that the die is malformed in some way (i.e., at least one face with an altered chance of appearing). So based on what I just said above, if the test says that the die is unbalanced (reject the null hypothesis), then you can pretty much take that to the bank. But if the test fails to say that -- then we've got an open question as to what, exactly, that tells us. (Hence, this investigation.)

Here are three important terms in a hypothesis test: n, α (alpha), and β (beta). The value n is the sample size; how many times we roll the die for our test (previously I'd said the test is justified for a minimum of n=5 times the faces on the die; i.e., 30 rolls for a d6, 100 rolls for a d20). Value α is the chance of a false positive (Type I error; rejecting the null hypothesis when it's true; apparently getting evidence of an unfair die when it's actually balanced; also called the "significance level"). Value β is the chance of a false negative (Type II error; non-rejection of the null hypothesis when it's false; finding no evidence of an unfair die when it's actually unbalanced; also 1 - "power level"). More on these error types here.
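For anyone who wants the test itself in concrete form, here is a minimal Java sketch of Pearson's chi-square test for a die at α = 5%. This is not the program from the earlier posts; the class name, method names, and the two sets of tallies are all made up for illustration. The only outside facts used are the standard-table critical values (5 degrees of freedom for a d6, 19 for a d20).

```java
// Illustrative sketch of Pearson's chi-square test for a fair die (not the author's program).
final class DieChiSquare {
    // Upper 5% critical values of the chi-square distribution, from standard tables:
    // 5 degrees of freedom for a d6, 19 degrees of freedom for a d20.
    static final double CRITICAL_D6 = 11.070;
    static final double CRITICAL_D20 = 30.144;

    // Sum over all faces of (observed - expected)^2 / expected, where expected = n / faces.
    static double statistic(int[] counts) {
        int n = 0;
        for (int c : counts) {
            n += c;
        }
        double expected = (double) n / counts.length;
        double sum = 0.0;
        for (int c : counts) {
            double diff = c - expected;
            sum += diff * diff / expected;
        }
        return sum;
    }

    // True means "reject the null hypothesis", i.e., evidence that the die is unbalanced.
    static boolean rejectsFairness(int[] counts, double criticalValue) {
        return statistic(counts) > criticalValue;
    }

    public static void main(String[] args) {
        int[] lopsided = {15, 3, 3, 3, 3, 3};    // hypothetical tallies from 30 rolls of a d6
        int[] missingFace = {0, 6, 6, 6, 6, 6};  // hypothetical tallies: one face never came up
        System.out.println("lopsided:     statistic " + statistic(lopsided)
                + ", reject = " + rejectsFairness(lopsided, CRITICAL_D6));
        System.out.println("missing face: statistic " + statistic(missingFace)
                + ", reject = " + rejectsFairness(missingFace, CRITICAL_D6));
    }
}
```

Working the two examples by hand with an expected count of 5 per face: the first comes to (100 + 5×4)/5 = 24, well over 11.070, so it is rejected. The second comes to (25 + 5×1)/5 = 6, so a 30-roll sample in which one face never appears at all is not rejected.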

Going into the test, you can pick any 2 of the 3 (the last term is logically determined by the others). Obviously, we would like both α and β to be as low as possible, but neither can be zero. A higher sample size n, of course, always helps us. But for a fixed sample size n, lowering α increases β and vice-versa (it is, therefore, a balancing act). In practice, you usually set n to whatever size you can best achieve (time and grant-money permitting), and α to the industry-standard of 5%.
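The balancing act can be seen in a small simulation. The Java sketch below is my own illustration, not anything from the earlier posts: it assumes a hypothetical d6 whose first face comes up 25% of the time, with the other five faces sharing the rest equally, rolls it n=100 times per trial, and estimates β at three α levels. The critical values are the standard-table figures for 5 degrees of freedom. Each α level reuses the same random seed, so all three are judged on the identical set of simulated samples.

```java
import java.util.Random;

// Illustrative sketch: for a fixed n, a stricter alpha yields a larger beta.
final class AlphaBetaTradeoff {
    // Upper-tail chi-square critical values for 5 degrees of freedom (a d6), from standard tables.
    static final double[] ALPHAS = {0.10, 0.05, 0.01};
    static final double[] CRITICALS = {9.236, 11.070, 15.086};

    // Rolls a hypothetical biased d6 n times (face 0 has probability p0, the other five
    // faces split the remainder equally) and returns the chi-square statistic against a fair d6.
    static double rollStatistic(Random rng, int n, double p0) {
        int[] counts = new int[6];
        for (int i = 0; i < n; i++) {
            if (rng.nextDouble() < p0) {
                counts[0]++;
            } else {
                counts[1 + rng.nextInt(5)]++;
            }
        }
        double expected = n / 6.0;
        double sum = 0.0;
        for (int c : counts) {
            double diff = c - expected;
            sum += diff * diff / expected;
        }
        return sum;
    }

    // Estimated beta: the fraction of trials in which the biased die is NOT rejected.
    static double estimateBeta(double critical, int n, double p0, int trials, long seed) {
        Random rng = new Random(seed);
        int misses = 0;
        for (int t = 0; t < trials; t++) {
            if (rollStatistic(rng, n, p0) <= critical) {
                misses++;
            }
        }
        return (double) misses / trials;
    }

    public static void main(String[] args) {
        for (int i = 0; i < ALPHAS.length; i++) {
            double beta = estimateBeta(CRITICALS[i], 100, 0.25, 20000, 42L);
            System.out.println("alpha = " + ALPHAS[i] + "  estimated beta = " + beta);
        }
    }
}
```

Because a smaller α means a higher bar for rejection, every sample that escapes rejection at α=10% also escapes it at 5% and 1%, so the estimated β can only go up as α goes down.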

In theory you could solve for the resulting β value -- except that to do so would require perfect knowledge of the balance of the die you're testing -- and of course, that's what you're trying to determine in the first place with the hypothesis test.

So: here's what you'll be getting below. Assume that your die has a single odd face that is biased in some way (different probability than the others: I'll call this special probability P0), and that the other faces all have equal probability from what's left. We'll make a graph for every possible value of P0 (on the x-axis), and compare it to the simulated value of 1-β (so that higher is better, on the y-axis), and see what that looks like. This is called the "power curve" for the test; it's an important analysis, but usually glossed over in introductory statistics courses.

(Side note: Is the "one odd face" model realistic? Probably not: if you shave down one edge, then you'll change the likelihood of at least two faces appearing. If one face appears less, then the opposite face should come up more. But at least this model gives us an impression of the test's power.)

This is accomplished by the following Java program (GPL v.2 license). The program takes a certain type of die and fixed levels for n and α, and outputs a bunch of (x,y) values, where x = P0 and y = Power of the test for that odd-face-probability value. These values I copy into a spreadsheet program and then generate a chart from the results. (The program only makes one table at a time; to change die-sides, n, α, or anything else, you've got to manually edit & recompile).
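In case the linked program isn't handy, here is a compact Java sketch of the same idea. To be clear, this is an independent reconstruction from the description above, not the original source. The class name, trial count, random seed, and the step size for P0 are my own choices, and the critical value (11.070, for a d6 at α=5%) is hard-coded from a standard chi-square table. It should land in the same ballpark as the charts, but don't expect it to match them digit for digit.

```java
import java.util.Random;

// Illustrative sketch of a power-curve simulation for the "one odd face" model.
final class PowerCurveSketch {
    // Estimates the power of the chi-square test: the fraction of simulated samples of n rolls
    // in which the die is rejected as unbalanced. Face 0 appears with probability p0;
    // the remaining faces share the leftover probability equally.
    static double power(int sides, int n, double p0, double critical, int trials, Random rng) {
        double expected = (double) n / sides;
        int rejections = 0;
        for (int t = 0; t < trials; t++) {
            int[] counts = new int[sides];
            for (int i = 0; i < n; i++) {
                if (rng.nextDouble() < p0) {
                    counts[0]++;
                } else {
                    counts[1 + rng.nextInt(sides - 1)]++;
                }
            }
            double chiSquare = 0.0;
            for (int c : counts) {
                double diff = c - expected;
                chiSquare += diff * diff / expected;
            }
            if (chiSquare > critical) {
                rejections++;
            }
        }
        return (double) rejections / trials;
    }

    public static void main(String[] args) {
        int sides = 6;            // type of die
        int n = 50;               // rolls per simulated test
        double critical = 11.070; // chi-square critical value, 5 degrees of freedom, alpha = 5%
        int trials = 10000;       // simulated tests per P0 value
        Random rng = new Random(2011L);
        // One (x, y) pair per line: x = P0, y = estimated power.
        for (int k = 0; k <= 20; k++) {
            double p0 = k * 0.05;
            System.out.println(p0 + "\t" + power(sides, n, p0, critical, trials, rng));
        }
    }
}
```

As with the original, changing the die, n, or α means editing the constants and recompiling; for a d20 at α=5% the table value for 19 degrees of freedom is 30.144. The tab-separated output pastes straight into a spreadsheet for charting.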


Below are the results for a d6, across several increasing values of n (number of rolls we might perform). Or click here for a PDF with some additional charts:

This shape is basically what we expect from a "power curve" chart: something of a "V" shape, with the bottom-point at the value of an actual balanced face (here, 1/6 = 0.17). The y-axis shows the power of the test: the probability of rejecting the null hypothesis in the test (i.e., a finding for the alternative: that the die is unbalanced). It's more likely for this to happen the more skewed the die is (further left or right). It's less likely for this to happen if the die is minimally skewed or actually balanced (near the center). The fact that in each case it actually bottoms out at a value of approximately 0.05 -- that is, the α value: what we initially chose as the chance of a Type I error (rejection when it's balanced) -- gives us confidence that the simulation is giving us accurate results.

So, what is the major lesson here? At moderately low values of n, this test freaking sucks. Look at the chart for n=50 (first one above) and consider, for example, the case where one face never shows up at all (P0=0.00). The test only has an 88% chance of reporting that die as being unbalanced. It's even worse at n=30 (not shown here), which we previously said was a permissible number of rolls for the test; then the power is only about 40%. That is, for n=30, the test only has a 40% chance of telling any difference between a d5 and a d6!

The n=50 d6 power curve has a very gentle bend to it, and what we would like is something with a much sharper dip -- ideally a low chance of rejection at P0=1/6 (17%), a high chance away from it, and as rapid a switchover as possible. For that purpose, n=100 looks a little better, and n=200 even better than that. At n=500 we've really got something: nearly 100% chance of rejection if the special face comes up less than 10% or over 25% of the time. (The PDF shows even sharper power curves for n=1000 and n=2000.)

Let's try that again for a d20 (which would be balanced at a value of P0=0.05):

Here, I didn't even bother to show anything less than n=500, since the curves below that point are just dreadful (shown in the PDF again linked here). For example, at n=100 (previously the nominal minimum number of rolls), the chance of the test detecting the difference between a d19 and a d20 (i.e. one face missing) is only 16%! So in this case, although we have the same low false positive rate of α=5%, we have a sky-high false negative rate of β=84%. While a finding of "unbalanced" is one that we can count on, a finding of "not unbalanced" tells us almost nothing: it would usually do that anyway, even for a die entirely missing one or more faces.

This is honestly not something that I realized before doing the simulation experiment.

Take-away lesson is this, I think: The bare-minimum number of rolls given previously (5 times faces on the die) is pretty much useless for the test to be powerful enough to actually detect an unbalanced die. For a d6, I wouldn't want to use any less than n=100 as a minimum (and ideally something like n=500 if you're serious about it). For a d20, n=500 would be a useful minimum (and at least several thousand to find reasonably small variations). So realize that it takes a lot of rolling to have a chance of actually detecting unbalanced dice; look at the charts above and decide for yourself how small a bias you want to have a chance of identifying.



Postscript: Again, this is an analysis that is frequently overlooked, and if you got through this whole post, then you probably have a deeper understanding of the power of Pearson's chi-square test than even some professional statisticians (I dare say). For example, in the old Dragon magazine article on the subject (Dragon #78, Oct-1983), writer D.G. Weeks completely screwed up on this point. He wrote:
If your chi-square is less than the value in column one (labelled .10), the die is almost certainly fair (or close enough for any reasonable purpose).
Well, that's just totally false. At minimal sample sizes, the test is of such low power that the die can be almost certainly unfair and still pass the criteria. Furthermore, Weeks presented the possibility of a test for a given suspected die-face frequency and included it in the attached BASIC computer program, in doing so vastly confusing the issue of what's the null and what's the alternative hypothesis. To wit:
In this case it might make more sense to test directly whether this observation is really accurate, rather than simply making the general test described earlier. If what you suspect is true, a specialized test will show the bias more readily...
What I would say is that this would actually prove the bias LESS readily, since your suspicion has now become the null hypothesis, and non-rejection of the null hypothesis tells us next to nothing about the die -- because that's what happens by default anyway, and the test is so very low-powered. In fact, Weeks is making precisely the mistake that we are being warned about by Professor Lowry in the quote at the very top of this blog post (read more at that link if you like: "it is a terrible idea to accept the null hypothesis upon failing to find a significant result in a one-dimensional chi-square test..."). Don't you make the same error!