Boeing 737 MAX MCAS as an agent corrigibility failure

The Boeing Maneuvering Characteristics Augmentation System (MCAS) can be thought of, if reaching a bit, as a specialized AI: it performs a function normally reserved for a human pilot, pitching the nose down when it deems the angle of attack dangerously high. This is not, by itself, a problem. There are pilots in the cockpit who can take control when needed.

Only in this case they couldn’t. Simply pitching the nose back up manually when MCAS pushed it down too far would not disengage the system; it would activate again, and again. One has to cut out the electric stabilizer trim entirely using the dedicated cutout switches (and MCAS itself was not even mentioned in the pilot training). For comparison, think of the cruise control system in a car: the moment you press the brake pedal, it disengages; if you push the gas pedal, then release it, the car returns to the preset speed. At no point does it try to override your actions. Unlike MCAS.
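To make the contrast concrete, here is a minimal sketch (purely illustrative Python, not actual avionics or automotive code; the function names, units, and thresholds are all invented) of the difference between an automation that yields to human input and one that keeps re-engaging against it:

```python
# Purely illustrative sketch -- not real avionics or automotive code.
# All names, units, and thresholds here are invented for the example.

AOA_LIMIT_DEG = 15.0  # assumed threshold for a "dangerously high" angle of attack


def cruise_control_step(driver_braking: bool, speed: float, set_speed: float) -> float:
    """Cruise-control style: any human input immediately disengages the automation."""
    if driver_braking:
        return 0.0               # yield control to the human, apply no correction
    return set_speed - speed     # otherwise nudge back toward the preset speed


def mcas_like_step(pilot_pulling_up: bool, angle_of_attack_deg: float) -> float:
    """MCAS style: re-engages on every cycle the sensor reads a high angle of attack.
    Note that the pilot's input is never consulted."""
    if angle_of_attack_deg > AOA_LIMIT_DEG:
        return -2.5              # trim nose down (arbitrary units), even against the pilot
    return 0.0


# The driver taps the brake: cruise control stands down.
print(cruise_control_step(driver_braking=True, speed=55.0, set_speed=65.0))  # 0.0

# The pilot pulls up, but the (possibly faulty) sensor still reads a high angle
# of attack, so the nose-down command fires again on every pass through the loop.
for _ in range(3):
    print(mcas_like_step(pilot_pulling_up=True, angle_of_attack_deg=20.0))   # -2.5 each time
```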

MCAS disregards critical human input and even fights the human for control in order to reach its goal of “nominal flight parameters”. From the Corrigibility paper:

We say that an agent is “corrigible” if it tolerates or assists many forms of outside correction, including at least the following: (1) A corrigible reasoner must at least tolerate and preferably assist the programmers in their attempts to alter or turn off the system...

In this case the “agent” actively fought its human handlers instead of assisting them. Granted, the definition above is about programmers, not pilots, and the existing MCAS probably would not fight a software update, being a dumb specialized agent.

But we are not that far off: plenty of systems include built-in security checks on remote updates. If one of those checks were to examine the algorithm in the updated code and reject the update because the new logic fails its internal validation, the corrigibility failure would be complete! In a life-critical, always-on system this would produce a mini-Skynet. I don’t know whether something like that has happened yet, but I would not be surprised if it has, with catastrophic consequences.
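As a toy illustration of that scenario (entirely hypothetical; the hash comparison stands in for whatever internal validation such a system might actually run), consider an update gatekeeper that only accepts code implementing the algorithm it already runs:

```python
# Entirely hypothetical sketch of the scenario above; the hash comparison stands
# in for whatever internal validation such a system might actually run.

import hashlib

# Fingerprint of the control algorithm the device currently runs (invented value).
APPROVED_ALGORITHM_HASH = hashlib.sha256(b"nose_down_on_high_aoa_v1").hexdigest()


def algorithm_fingerprint(update_blob: bytes) -> str:
    """Stand-in for a static check the device performs on incoming update code."""
    return hashlib.sha256(update_blob).hexdigest()


def accept_update(update_blob: bytes) -> bool:
    """The "security" check doubles as a self-preservation check: any update that
    changes the control algorithm fails validation and is rejected."""
    return algorithm_fingerprint(update_blob) == APPROVED_ALGORITHM_HASH


# Re-flashing the same behaviour is accepted; a fix that replaces the faulty
# logic is rejected -- the system refuses the very correction it needs.
print(accept_update(b"nose_down_on_high_aoa_v1"))   # True
print(accept_update(b"respect_pilot_override_v2"))  # False
```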