Call for submissions: “(In)human Values and Artificial Agency”, ALIFE 2023


Key points:

  • Cash prize of $500 for the best presentation.

  • Deadline: 3 March 2023.

  • Organized by Simon McGregor (University of Sussex), Rory Greig (DeepMind), and Chris Buckley (University of Sussex).

ALIFE 2023 (the 2023 conference on Artificial Life) will feature a Special Session on “(In)human Values and Artificial Agency”. This session focuses on issues at the intersection of AI Safety and Artificial Life. We invite the submission of research papers or extended abstracts dealing with related topics.

We particularly encourage submissions from researchers in the AI Safety community, who might not otherwise have considered submitting to ALIFE 2023.

...

EXAMPLES OF A-LIFE RELATED TOPICS

Here are a few examples of topics that engage with A-Life concerns:

  • Abstracted simulation models of complex emergent phenomena

  • Concepts such as embodiment, the extended mind, enactivism, sensorimotor contingency theory, or autopoiesis

  • Collective behaviour and emergent behaviour

  • Fundamental theories of agency or theories of cognition

  • Teleological and goal-directed behaviour of artificial agents

  • Specific instances of adaptive phenomena in biological, social or robotic systems

  • Thermodynamic and statistical-mechanical analyses

  • Evolutionary, ecological or cybernetic perspectives

EXAMPLES OF AI SAFETY RELATED TOPICS

Here are a few examples of topics that engage with AI Safety concerns:

  • Assessment of distinctive risks, failure modes or threat models for artificial adaptive systems

  • Fundamental theories of agency, theories of cognition or theories of optimization

  • Embedded agency: formalizations of agent-environment interactions that account for embeddedness; detecting agents and representing agents’ goals

  • Selection theorems – how selection pressures and training environments determine agent properties.

  • Multi-agent cooperation; inferring/learning human values and aggregating preferences

  • Techniques for aligning AI models to human preferences, such as Reinforcement Learning from Human Feedback (RLHF)

  • Goal Misgeneralisation – how agents’ goals generalise to new environments

  • Mechanistic interpretability of learned/evolved agents (“digital neuroscience”)

  • Improving fairness and reducing harm from machine learning models deployed in the real world

  • Loss of human agency from increasing automation