Monday, March 4, 2019

More Missile Modeling

I've written so many letters on the physics and statistics of missiles, archery, and ballistics that it could sink a warship (search the blog, you'll see). So many, in fact, that it's sometimes easy to lose the plot. I figured I'd summarize some of our findings to date.

We have two primary sources of data. One is from Longman and Walrond, Archery (1894) -- as reported by Barrow in Dragon #58, "Aiming for realism in archery" (Feb. 1982). He writes (first noted on the blog here):

English archers use a 48-inch-diameter target in tournament competition... A compilation of the twelve highest tournament results during a one-year period shows that the “hit” percentages of England’s finest archers at three ranges were: 92% hits at 60 yards, 81% at 80 yards, and 54% hits at 100 yards distance.  

A second source of data is from more recent UK "clout" long-distance longbow competitions. Results from a competition in 2016 show that at a range of 180 yards, competitors hit a 12-foot radius target 42% of the time, and an 18-inch radius central target only 1% of the time. (Full data and spreadsheets on the blog here.)

Using that as a guideline, we've developed a simple physical simulation to model archery shots, using an idea I first saw in Conway-Jones, "Analysis of Small-Bore Shooting Scores", Journal of the Royal Statistical Society (1972). The idea is fairly simple: model shooting error in the x- and y-axis directions as two independent normal curves, which together form what is called the "bivariate normal distribution". (First noted on the blog here.)
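As a rough illustration of the bivariate normal idea (a minimal Python sketch, not the Java program from the repository), the snippet below draws independent normal errors in x and y and counts the shots that land inside a circular target. The names `hit_probability` and `simulate_hits`, and the assumption that both axes share one standard deviation `sigma_ft` measured at the target, are mine for illustration. How the simulator's "precision" setting maps to a spread is not assumed here. Under the equal-spread assumption the miss distance follows a Rayleigh distribution, which gives a closed form to check the simulation against.

```python
import math
import random


def hit_probability(radius_ft, sigma_ft):
    """Closed-form chance that a shot lands within radius_ft of the center.

    Assumes independent normal errors in x and y with the same
    standard deviation sigma_ft (an illustrative assumption).
    """
    return 1.0 - math.exp(-radius_ft ** 2 / (2.0 * sigma_ft ** 2))


def simulate_hits(radius_ft, sigma_ft, shots=100_000, seed=0):
    """Monte Carlo estimate of the same hit chance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(shots):
        x = rng.gauss(0.0, sigma_ft)  # horizontal error
        y = rng.gauss(0.0, sigma_ft)  # vertical error
        if x * x + y * y <= radius_ft ** 2:
            hits += 1
    return hits / shots
```

For example, `simulate_hits(2.0, 1.0)` and `hit_probability(2.0, 1.0)` should agree to within sampling noise.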

A simulation of this model is written as a Java program and posted to a public code repository at GitHub (here). If we run that program with settings of precision = 6.7 (extremely high skill!), target radius = 2 feet, and long output form (that is: parameters 6.7 2 -L), then we get to-hit results very close to the 1894 Archery figures (compare highlights to quote above):


Likewise, if we run the program with precision = 1.6, target radius = 12 feet (parameters 1.6 12 -L), then we get results very close to the recent UK clout tournaments:


Also, if we set the target radius of this latter experiment to 1.5 feet (that is, 18 inches), then the hit rate at 180 yards becomes 1%, exactly as seen in the real-world data. Comparing these two data sources, we might be led to think that English archery skill has dropped off precipitously between 1894 and 2016 (precision 6.7 in the former and 1.6 in the latter). But based on the short quote regarding the first data source, we might say that it was cherry-picking its data: the best dozen results across all tournaments in England in a year. Contrast that with the second data source, which includes all 30 competitors in one single tournament, whether they performed well or not. So the jury is still out on that issue.

That ends the recap. Now for a new thought: What is the best statistical model for these numbers? Clearly it's capped above and below: the chance to hit (or miss) cannot possibly be more than 100%, or less than 0%. Presumably we want a smooth, continuous curve, and one that can theoretically handle any arbitrary distance. Effectively we have just given the definition for a sigmoid curve, that is, an S-shaped curve seen in many probability cumulative distribution functions. The simplest model for this is the logistic function, as applied in logistic regression analysis.

One problem with this observation is that logistic regression of this sort is not built into standard spreadsheet programs (LibreOffice, Excel) like many other types are (linear, polynomial, exponential, etc.). So what I've done below is this: used the model derived from 2016 clout shooters (second experiment above; precision = 1.6, set target radius = 2 feet); increased granularity of the output to increments of 2 yards (for added detail); converted hit chances to miss chances (because the logistic curve expects numbers to be increasing from left to right); and used the online Desmos graphing calculator site (here; thanks immensely, guys!) to regress it to a logistic function. We get the best possible fit as follows:


Note that our regression (orange curve) has an R² = 95.87% match with the numbers from our simulated physical model of UK long-distance clout shooters (black dots). One possible downside: the logistic formula shown in the bottom-left is probably too complicated to use in a standard D&D gaming session. However, a second observation occurs to us: in the central part of that curve, at distances from around 20 to 40 yards (that is, ignoring the parts that are close to 0% or 100%; i.e., the part with maximal rate-of-change), the curve is practically a straight line.
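Desmos does the fitting in this post, but for readers who want to try something similar offline, one common workaround is to take the logit of each miss chance, ln(p / (1 − p)), which turns a two-parameter logistic curve into a straight line that ordinary linear regression can handle. The Python sketch below assumes that two-parameter form, 1 / (1 + e^−(a + bx)). The function names are hypothetical, and this is not necessarily the formula or fitting method Desmos uses, so its coefficients and R² may differ from the ones in the chart.

```python
import math


def logistic(x, a, b):
    """Two-parameter logistic curve (assumed form for this sketch)."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def fit_logistic(xs, ys):
    """Least-squares line through (x, logit(y)); returns (a, b).

    Points with y at exactly 0 or 1 are skipped because their
    logit is undefined.
    """
    pts = [(x, math.log(y / (1.0 - y)))
           for x, y in zip(xs, ys) if 0.0 < y < 1.0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_z = sum(z for _, z in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in pts)
    b = sxz / sxx              # slope on the logit scale
    a = mean_z - b * mean_x    # intercept on the logit scale
    return a, b
```

Feeding it a list of distances and the matching simulated miss chances would return the two coefficients. Note that least squares on the logit scale weights the points differently from a direct curve fit, so the two approaches will not agree exactly.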

Let's find an approximating line for that "critical" part of the curve. Our regression formula generates the points (20, 0.28) and (40, 0.73) -- so, this is the region where hit-or-miss chances vary from about 25% to about 75%. Solving for an equation of a line through those points (using Wolfram Alpha or good ol' college algebra) gives: y = 0.0225x − 0.17. Note the slope m = 0.0225, which means the miss chance rises, and so the chance to hit drops, by 2.25% per yard in that region. Converting to feet we get 0.0225/3 = 0.0075, so: 0.75% per foot, or 7.5% per 10 feet. Note that this is freakishly close to the 7.6% per 10 feet figure we saw in the Milks spear-throwing experiment a few weeks ago.
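The arithmetic above can be checked in a few lines of Python (the helper name `line_through` is just for illustration):

```python
def line_through(p1, p2):
    """Slope and intercept of the line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)
    b = y1 - m * x1
    return m, b


# Points read off the regression curve: (yards, miss chance).
m, b = line_through((20, 0.28), (40, 0.73))

# Convert the per-yard slope to a per-10-feet slope (3 feet per yard).
per_10_feet = m / 3 * 10

print(f"y = {m:.4f}x + ({b:.2f})")      # y = 0.0225x + (-0.17)
print(f"{per_10_feet:.1%} per 10 feet")  # 7.5% per 10 feet
```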

In conclusion: It seems like our data and multiple models are telling us that there's a consistent dropoff in hit rates of around 7.5% per 10 feet, in the part of the range where it matters (neither a near-automatic hit nor a near-automatic miss). This is why in the last few months in my D&D game I've rounded this number down to 5% and simply said there's a −1 chance to hit per 10 feet, on a d20 attack roll. But how to account for the extended upper and lower parts of the sigmoid S-curve distribution (where the chances are almost, but not quite, 0% or 100%)? Well, the classic rule to auto-miss on natural "1" and auto-hit on "20" (or something close to that: say they count as −10 or +30) does a fair job of recreating the rest of that model.
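To make the table rule concrete, here is a small Python sketch of it. It assumes the penalty is rounded down (one point per full 10 feet), that a natural 1 always misses and a natural 20 always hits, and that `needed_roll` (a hypothetical parameter) is the d20 result the shooter would need at zero range after all other modifiers.

```python
def range_penalty(distance_ft):
    """-1 to hit per full 10 feet of range (rounded down, by assumption)."""
    return int(distance_ft // 10)


def hit_chance(needed_roll, distance_ft):
    """Chance to hit on a d20 with the range penalty applied.

    A natural 20 always hits and a natural 1 always misses.
    """
    penalty = range_penalty(distance_ft)
    hits = 0
    for roll in range(1, 21):
        if roll == 20:
            hits += 1  # auto-hit
        elif roll != 1 and roll - penalty >= needed_roll:
            hits += 1  # ordinary hit after the range penalty
    return hits / 20
```

With `needed_roll = 11`, the chance falls from 50% at point-blank range to 35% at 30 feet, and bottoms out at 5% (natural 20 only) once the penalty outruns the die. That flat floor is a crude stand-in for the long tail of the S-curve.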

(P.S. Keep in mind that the exact hit-or-miss numbers shown above assume a single unmoving, undefended, man-size target of radius 2 feet or so. In practice, we need all kinds of extra modifiers to account for aware, defensive men in the field; shooting at a clustered army of bodies; and so forth. But from what we can tell, the specific range-modifier increments would be generally consistent regardless of other considerations.)

15 comments:

  1. Just a clarification... The first −1 chance to hit is after each 10 feet? I mean, shooting at someone at 8 feet has +0 or -1 chance to hit?

    1. Good point: I would definitely round penalty down, so at 8 feet I'd say +0.

    2. Oddly, on the car ride in this morning I was asking myself "wonder if Delta has any research/thoughts on using ranged weapons on adjacent/close range foes." Point blank shot, as it were....

    3. So in looking at this data chart, it looks like a penalty to hit doesn't really kick in until 10 yards/30 feet (if I am understanding it correctly).
      Since in my own games, I tend to focus on "in dungeon" use of weapons, I might waive penalties for the first 30 feet, or in some way reserve those penalties for "field" use....
      Not sure.

    4. You're right, and I've totally wavered on the fence of that in exactly the same way. I went with the flat -1/10' for simplicity. Having it staggered at 30' (so, subtract then divide) seemed like something I was very likely to forget about, or hard to explain, in-game.

      In some sense that quasi-balances out with being generous in rounding down from 7.5% to 5% per ten feet. E.g., up at 40 yds = 120 ft, the chart suggests it should be -70%, and we're only penalizing -12/20 = -60%, so still overall biased to generosity to the shooter.

    5. Interesting how, all the time, it keeps looking more and more like ranged hit probabilities would be most elegant when using dungeon tiles or battle mats with squares of 3 1/3 feet (or 1 meter or 1 yard, if easy approximations are preferred). That would then let you say that 'short range' is within 10 squares, accuracy suffers a -1 penalty for every 2 squares beyond that (as -5% every 6 2/3 feet is equivalent to -7.5% every 10 feet), and thrown weapons have a maximum range of 20 squares.

    6. That's a fair point, actually.

  2. Amazing stuff!
    If you had not explained it at the end, I would be completely lost (undergrad in literary studies and linguistics, no maths; even though I did one semester of physics, I never really got past the first few exams).
    So thank you once more for this enlightening and inspiring blog.
    Your "official house rules" (OED) have inspired me to (re)write my ruleset (titled: "Smasher & Devourer", naming two of its most fearsome monsters) excluding the 'Healer' class and applying a progressive penalty per distance for missile attacks.
    And, obviously, I am using Target 20.
    This will be run on a second table (the first one we discussed on your post about best algorithms is still stuck with THAC0).
    This new table loved the cleric-less setting and seems passionate about how easy T20 is.
    So, once more, cheers from Brazil.
    And keep up the great work.

    1. Your games sound fun! (Would like to see that ruleset.)

      I think THAC0 is great, too, if you cut the subtraction and treat it instead like Target 20, i.e., so it's just target 19, target 16, etc. It then even has a little less math at the table as you don't have to add in a level attack bonus. (That said, there's simply no beating Target20 for monsters.)

    2. Igor and Scott: Thanks so much for the kind words! I agree, Igor's game sounds excellent, I'd love to see his custom rules too, and so glad you're using Target 20. :-)

    3. If Scott Keeney ever sees this comment, could you please explain in full detail how this "target 19, 16, …" system derived from THAC0 works?
      Thank you two in advance.
      I will be making a Blogspot for "Smasher & Devourer".
      And I plan on translating my ruleset to English eventually.
      So far, so good.
      Cheers!

    4. Really it's simply the way THAC0 has always worked, though rulebooks have often explained it poorly. The idea is that as you level, instead of gaining a bonus to attack rolls, you lower your target number. So at THAC0 20 your roll on a d20 + target's AC + any bonuses or penalties must equal or exceed 20. At THAC0 19 they must be ≥19, etc. The level bonus is just pre-subtracted from the target number instead of adding it to each roll individually.

  3. This ↑

    And "rulebooks have often explained it poorly" can't be overstated.

    1. [New comment 'cuz I forgot a few lines]

      Oh! So it is just "Target (20 minus PC's level)".
      Easy peasy.
      I've already explained Target 20 for the newer table, so I will stick to it.
      But for the older table (running the godawful THAC0 we all know) I will explain this "Target THAC0" system and try to get them to buy into it.
      Thank you Scott and Daniel for your patience and collaboration.

    2. I agree, that's essentially it. (Note that for non-fighters you're subtracting a fraction of the level, like maybe half the wizard level or three-quarters the cleric level.)

      The other thing about THAC0 is that traditionally people keep a full table of possible ACs on the PC sheet, subtracting the given AC in each case. Then any time they get a bonus they subtract that from the THAC0 -- so the idea really puts you on the path to subtracting all bonuses all the time (instead of directly adding to the die roll).
