Independent, self-taught AI safety researcher. I’ve spent the last few years mostly alone stress-testing language models, building the Pandora Benchmark, and trying to turn a suspicious amount of controlled madness into a usable theories of runtime alignment, adversarial evaluation, and model behaviour under pressure.
Georgi Shopov
Independent, self-taught AI safety researcher. I’ve spent the last few years mostly alone stress-testing language models, building the Pandora Benchmark, and trying to turn a suspicious amount of controlled madness into a usable theories of runtime alignment, adversarial evaluation, and model behaviour under pressure.