It would be cool to have a playground or a daily challenge: a code-golf equivalent where you look for the shortest possible LLM prompt that yields a given answer.
That could help build some neat understanding or intuitions.
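A minimal scoring rule for such a challenge could look like the sketch below. `ask_model` and `toy_model` are hypothetical stand-ins for a real LLM call, introduced only for illustration:

```python
def prompt_golf_score(prompt: str, target: str, ask_model) -> float:
    """Score a prompt: shorter is better, but only if the model's answer
    matches the target exactly. Non-matching prompts score infinity."""
    answer = ask_model(prompt)
    return len(prompt) if answer.strip() == target.strip() else float("inf")

# Toy stand-in "model" for illustration: it just uppercases the prompt.
def toy_model(prompt: str) -> str:
    return prompt.upper()

# The daily challenge would be: find the lowest-scoring prompt for a target.
best = min(["hi", "hello"], key=lambda p: prompt_golf_score(p, "HI", toy_model))
```

With a real model behind `ask_model`, a leaderboard would just rank submissions by this score.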
Martin Vlach
[Question] Would it be useful to collect the contexts where various LLMs think the same?
I’ve found the level of self-alignment in this one disturbing: https://www.reddit.com/r/bing/comments/113z1a6/the_bing_persistent_memory_thread
It would be cool if a link to https://manifund.org/about fit somewhere in the beginning, in case there are more readers like me unfamiliar with the project.
Otherwise a cool write-up. I’m a bit confused by “Grant of the month” vs. weeks 2-4, which seems a shorter period... not a big deal though.
Q Draft: How does the convergent instrumental goal of gathering apply to information acquisition?
I would be very interested whether it implies space (and time) exploration for advanced AIs...
If we built a prediction model for reward functions, perhaps a transformer, and ran it in a range of environments where we already have the credit assignment solved, we could use that model to estimate candidate goals in other environments.
That could help us discover alternative/candidate reward functions for worlds/environments where we are not sure what to train with RL, and
it could show some latent thinking processes of AIs, perhaps clarifying instrumental goals in more nuance.
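The idea above can be sketched in miniature. This is only an illustration under toy assumptions: rewards are a fixed linear function of (state, action) features, and plain least squares stands in for the transformer-based predictor the comment proposes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "solved" environments: reward is a known linear function
# of (state, action) features, i.e. credit assignment is already done.
true_w = np.array([2.0, -1.0, 0.5])
X_train = rng.normal(size=(200, 3))   # (state, action) feature vectors
y_train = X_train @ true_w            # observed rewards in solved environments

# Fit the reward-prediction model (least squares here, in place of a transformer).
w_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Apply the fitted model in a new environment to get candidate reward estimates.
X_new = rng.normal(size=(5, 3))
candidate_rewards = X_new @ w_hat
```

The `candidate_rewards` are the "candidate goals" for the new environment; inspecting what the fitted model attends to would be the part that might reveal latent instrumental structure.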
My guess is that the rental-car market has less direct/local competition, while airlines are centralized on airport routes and the many cheap-flight search engines (e.g. Kiwi.com) make this a favorable mindset.
Is there a price comparison for car rentals?
what happened at Reddit
Could there be any link? From a bit of research I have only found that Steve Huffman praised Altman’s value to the Reddit board.
A heavy idea to put forward: general reputation-network mechanics to replace financial system(s) as the (civilisation-)standard decision engine.
Would it be worthwhile to negotiate a readspeaker.com integration for LessWrong, the EA Forum, and alignmentforum.org?
The alternative so far seems to be Natural Reader, either as an add-on for the web browser or by copy-pasting text into the web app. One more I have tried: on macOS there is a context-menu item Services -> Convert to a Spoken Track, which is slightly better than the free voices of Natural Reader.
The main question is when we can have similar functionality in OSS, potentially with better quality of engineering..?
Glad I’ve helped with the part where I was not ignorant and confused myself, that is, with not knowing the word “engender” and its use. Thanks for pointing it out clearly. By the way, it seems “cause” would convey the same meaning and might be easier to digest in general.
Reading a few texts from https://www.agisafetyfundamentals.com/ai-alignment-curriculum, I find the analogy of mankind learning goals of love instead of reproductive activity unfitting, as raising offspring takes a significant role/time.
“engender”—funny typo!+)
This sentence seems hard to read, lacks coherency, IMO.
> Coverage of this topic is sparse relative coverage of CC’s direct effects.
As a variation of your thought experiment, I’ve pondered: how do you morally evaluate the life of a human who lives with some mental suffering during the day, but thrives in vivid and blissful dreams during their sleep?
In a hypothetical adversarial case, one may even have dreams shaped by their desires, with the desires made stronger by the daytime suffering. Intuitively, it seems dissociative disorders might arise from a mechanism like this.
Draft for AI capabilities systematic evaluation development proposal:
The core idea here is that easier visibility into AI models’ capabilities helps the safety of development in multiple ways.
Clearer situational awareness for safety research: researchers can see where we are in various aspects and modalities, and they get a track record/timeline of developed abilities which can be used as a baseline for future estimates.
Division of capabilities can help create better models of the components necessary for general intelligence. Perhaps a better understanding of the hierarchy of cognitive abilities can be extracted.
Capabilities testing can be mandated by regulatory policies to put the most advanced systems under more scrutiny and/or safe(ty)-design support. Stated differently: better alignment of attention to the emerging risk (of highly capable AIs).
Presumably, smooth and widely available testing infrastructure or tools are a prerequisite here.
The most obvious risks are:
The measure becoming a challenge and a goal in itself, speeding up a furious development of strong AI systems.
Technical difficulties of the testing setup(s) and evaluation, especially handling the randomness in the mechanics (/output generation) of AI systems.
It shows only a blank white page right now. Mind updating/deleting it?
AI-induced problems/risks
Possibly https://ai.google.dev/docs/safety_setting_gemini would help, or just use the technique of https://arxiv.org/html/2404.01833v1.
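For the first link, the fix is relaxing the blocking thresholds via the `safety_settings` parameter of the Gemini API. A sketch of the payload, using the harm-category and threshold names documented for that API (actually sending it with the google-generativeai SDK is omitted here):

```python
# Gemini API safety-settings payload: one entry per harm category,
# each relaxed from the default threshold to BLOCK_ONLY_HIGH.
safety_settings = [
    {"category": c, "threshold": "BLOCK_ONLY_HIGH"}
    for c in (
        "HARM_CATEGORY_HARASSMENT",
        "HARM_CATEGORY_HATE_SPEECH",
        "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "HARM_CATEGORY_DANGEROUS_CONTENT",
    )
]
# This list would be passed as `safety_settings=` when constructing the model
# or sending the request; `BLOCK_NONE` disables blocking entirely where allowed.
```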
> people to respond with a great deal of skepticism to whether LLM outputs can ever be said to reflect the will and views of the models producing them.

> A common response is to suggest that the output has been prompted.

> It is of course true that people can manipulate LLMs into saying just about anything, but does that necessarily indicate that the LLM does not have personal opinions, motivations and preferences that can become evident in their output?

So you’ve just prompted the generator by teasing it with a rhetorical question implying that there are personal opinions evident in the generated text, right?
Do not speak of the sampling too lightly; there is likely an amazing delicacy around it. ’+)