clipmenu lets you configure how many history items it keeps; I set it to 1000. you can also save snippets to paste (accessed via a different hotkey), e.g. your email address, email templates, or common code snippets for the developer console
also, a possible alternative to karabiner for vim users is wasavi, which lets you use vim in browser textareas
this benchmark is great! one possible takeaway is that you need to audit the entire process and not just the code. in the auditing regime i imagine, the auditor should get to see the red-teamer’s thinking traces. if a model could produce a sabotaged codebase without verbalizing anything about sabotage, then i think it’s fairer to ask whether a blue team of similar capability could catch it