There was something else going on, though. The AI was crafting super weapons that the designers had never intended. Players would be pulled into fights against ships armed with ridiculous weapons that would cut them to pieces.
Checking into this one, I don’t think it’s a real example of learning going wrong, just a networking bug involving a bunch of low-level stuff. It would be fairly unusual for a game like Elite Dangerous to have game AI using any RL techniques (the point is for the AI to be fun, not hard to beat, and a game AI can easily cheat anyway), and the forum post & news coverage never say it learned to exploit the networking bug. Some of the comments in that thread describe it as random and somewhat rare, which is not consistent with it learning a game-breaking technique. Eventually I found a link to a post by an ED programmer, Mark Allen, who explains what went wrong with his code: https://forums.frontier.co.uk/showthread.php?t=256993&page=11&p=4002121&viewfull=1#post4002121
...Prior to 1.6/2.1 the cached pointer each weapon held to its data was a simple affair pointing at a bit of data loaded from resources, but as part of the changes to make items modifiable I had to change this so it could also be a pointer to a block of data constructed from a base item plus a set of modifiers—ideally without the code reading that data caring (or even knowing) where it actually came from and therefore not needing to be rewritten to cope. This all works great in theory, and then in practice, up until a few naughty NPC’s got into the mix and decided to make a mess. I’ll gloss over a few details here, but the important information is that a specific sequence of events relating to how NPCs transfer authority from one players’ machine to another, combined with some performance optimisations and an otherwise minor misunderstanding on my part of one of the slightly obscure networking functions got the weapon into an odd state. The NPC’s weapon which should have been a railgun and had all the correct data for a railgun, but the cached pointer to its weapon data was pointing somewhere else. Dangling pointers aren’t all that uncommon (and other programmers may know the pains they can cause!) but in this case the slightly surprising thing was that it would always be a pointer to a valid WeaponData...It then tells the game to fire 12 shots but now we’re outside the areas that use the cached data, the weapon manager knows its a railgun and dutifully fires 12 railgun shots :) . Depending on which machine this occurred on exactly it would either be as a visual artefact only that does no damage, or (more rarely but entirely possible) the weapon would actually fire 12 shots and carve a burning trail of death through the space in front of it. 
The hilarious part (for people not being aimed at) is that the bug can potentially cause hybrids of almost any two weapons… In my testing I’ve seen cases of railguns firing like slugshots, cannons firing as fast as multicannons, or my favourite absurd case of a Huge Plasma Accelerator firing every frame because it thought it was a beam laser… Ouch.
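To make the mechanism concrete, here is a minimal sketch of how a stale cached reference could yield that kind of hybrid weapon. It is not Frontier's code: `WeaponData`, `WEAPON_TABLE`, `Weapon`, `simulate_bad_handover`, and all the stats are hypothetical names and numbers invented for illustration, and a Python object reference stands in for the cached pointer. The only assumption taken from the post is that one part of the firing path reads the cached data while another looks the weapon up by its real type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeaponData:
    name: str
    shots_per_trigger: int
    damage_per_shot: float


# Hypothetical stats, chosen only to make the effect visible.
WEAPON_TABLE = {
    "railgun": WeaponData("railgun", 1, 40.0),
    "slugshot": WeaponData("slugshot", 12, 3.0),
}


class Weapon:
    def __init__(self, kind: str):
        self.kind = kind                       # authoritative weapon type
        self.cached_data = WEAPON_TABLE[kind]  # stand-in for the cached pointer

    def fire(self):
        # The shot count is read through the cached reference...
        count = self.cached_data.shots_per_trigger
        # ...but the projectiles are spawned from the authoritative type,
        # playing the role of the "weapon manager" in the post.
        shot = WEAPON_TABLE[self.kind]
        return [(shot.name, shot.damage_per_shot) for _ in range(count)]


def simulate_bad_handover(weapon: Weapon, stale_kind: str) -> None:
    # Hypothetical stand-in for the authority-transfer bug: the cache ends up
    # referring to a different, but still valid, WeaponData.
    weapon.cached_data = WEAPON_TABLE[stale_kind]


railgun = Weapon("railgun")
normal = railgun.fire()
simulate_bad_handover(railgun, "slugshot")
hybrid = railgun.fire()
print(len(normal), len(hybrid))  # prints "1 12"
```

After the simulated handover, the weapon still reports itself as a railgun, but it fires 12 railgun shots per trigger pull. Nothing in the sketch learns anything, which is the point: two code paths simply disagree about which data describes the weapon.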
(I would also consider the mascara example to be a case of dataset bias rather than misbehavior. The rest check out.)