SociaLLM: proposal for a language model design for personalised apps, social science, and AI safety research

Roman Leventov

4 min read · Jan 2, 2024

SummaryBot

Executive summary: SociaLLM is a proposed language model architecture for building personalized AI applications, conducting social science research, and pursuing AI safety goals.

Key points:

SociaLLM tracks separate message streams related to conversations, individual users, and user pairs to enable personalization.
It could power apps for comment reordering, recommendations, customer service, education, mental health counseling, media analysis, and more.
The model facilitates research into language, theory of mind, group dynamics, information flow, and collective intelligence.
Studying deception and collusion with SociaLLM may inform techniques to prevent undesirable behavior in AI teams.
Open questions remain around optimally engineering SociaLLM blocks and measuring information content.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Roman Leventov

Announcement

I think SociaLLM has a good chance of getting OpenAI’s “Research into Agentic AI Systems” grant because it addresses two of the challenges. The first is the legibility of an AI agent’s behaviour: the agent’s behaviour becomes more “human-like” thanks to the weight sharing and regularisation techniques/inductive biases described in the post. The second is automatic monitoring: detecting duplicity or deception in an AI agent’s behaviour by comparing the agent’s ToMs “in the eyes” of different interlocutors, building on the work “Collective Intelligence in Human-AI Teams”.

I am looking for co-investigators for this (up to $100k, up to 8 months long) project with hands-on academic or practical experience in DL training (preferably), ML, Bayesian statistics, or NLP. The deadline for the grant application itself is the 20th of January, so I need to find a co-investigator by the 15th of January.

Another requirement for the co-investigator is that they preferably should be in academia, non-profit, or independent at the moment.

I plan to be hands-on during the project in data preparation (cleansing, generation by other LLMs, etc.) and training, too. However, I don’t have any prior experience with DL training, so if I apply for the project alone, this is a significant risk and a likely rejection.

If the project is successful, it could later be extended for further grants or turned into a startup.

If the project is not a good fit for you but you know someone who may be interested, I’d appreciate it a lot if you shared this with them or within your academic network!

Please reach out to me in DMs or at [email protected].

Architecture and training

SociaLLM is^[1] a foundation language model to be trained on chat, dialogue, and forum data where the identities of message authors (called "users" below) are stable and all messages have timestamps so we can have a global order of them.

The SociaLLM design builds upon the Mamba architecture, a language model that uses so-called state-space modelling (SSM) blocks instead of self-attention blocks. The model combines SSM blocks that track three separate message streams:
(1) the "local conversation"/flow of messages (which is exactly the training regime of the current LLMs);
(2) the message history of the particular user as well as their general "reading history", which in forum data could be approximated as the previous N (1-10) messages before each of the user's messages;
(3) the message history of a particular interlocutor of the user, which is the subset of the general "reading history" from the previous point that was authored by that particular other user.
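
The three streams above can be sketched as plain data wrangling. This is a minimal illustration, not the proposed implementation: the `Message` type, the `build_streams` function, and the `n_read` parameter are hypothetical names, and the reading history is approximated exactly as in point (2), by the N messages preceding each of the user's messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    author: str
    timestamp: float
    text: str


def build_streams(messages, user, interlocutor, n_read=5):
    """Split one conversation into the three streams seen from `user`'s side.

    `n_read` is the assumed reading window: the N messages that precede
    each of the user's own messages (the post suggests N between 1 and 10).
    """
    ordered = sorted(messages, key=lambda m: m.timestamp)  # global order

    # Stream 1: the local conversation, as a standard LLM would see it.
    conversation = list(ordered)

    # Stream 2: the user's own messages plus their approximate reading history.
    keep = set()
    for i, msg in enumerate(ordered):
        if msg.author == user:
            keep.add(i)
            keep.update(range(max(0, i - n_read), i))
    user_stream = [ordered[i] for i in sorted(keep)]

    # Stream 3: the part of stream 2 written by one particular interlocutor.
    pair_stream = [m for m in user_stream if m.author == interlocutor]

    return conversation, user_stream, pair_stream


chat = [
    Message("alice", 1.0, "Has anyone tried Mamba?"),
    Message("bob", 2.0, "Yes, on a small corpus."),
    Message("carol", 3.0, "I have not."),
    Message("alice", 4.0, "Bob, how did it go?"),
    Message("bob", 5.0, "Training was stable."),
]
conversation, user_stream, pair_stream = build_streams(chat, "alice", "bob", n_read=2)
```

In this toy chat, Bob's last message is in the conversation stream but not in Alice's streams, because she has not written anything after it and so is not assumed to have read it.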

Training this model would cost from 2 times (on purely 1-1 dialogue data) to ~10-15 times (on chat room and forum data, where messages from the most active users tend to be thoroughly interleaved) more than training current LLMs. The data would need to be wrangled to create training sub-datasets from the perspective of each user pair, but otherwise, it seems to me, the training shouldn't be much fancier or more complicated than the current distributed training algorithms for LLMs.
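
A rough sketch of what "sub-datasets from the perspective of each user pair" could look like is given below. The function name `pair_views`, the tuple layout, and the `n_read` window are assumptions made for illustration; this is a counting exercise, not a cost model.

```python
from collections import defaultdict


def pair_views(messages, n_read=5):
    """Group messages into sub-datasets keyed by (user, interlocutor).

    `messages` is a list of (author, timestamp, text) tuples. A message by
    `interlocutor` joins the (user, interlocutor) view when it is among the
    `n_read` messages preceding one of `user`'s messages.
    """
    ordered = sorted(messages, key=lambda m: m[1])
    views = defaultdict(set)
    for i, (user, _, _) in enumerate(ordered):
        for j in range(max(0, i - n_read), i):
            interlocutor = ordered[j][0]
            if interlocutor != user:
                views[(user, interlocutor)].add(j)
    return {pair: [ordered[j] for j in sorted(idx)] for pair, idx in views.items()}


dialogue = [
    ("alice", 1.0, "Hi"),
    ("bob", 2.0, "Hello"),
    ("alice", 3.0, "How are you?"),
    ("bob", 4.0, "Fine, thanks."),
]
views = pair_views(dialogue, n_read=1)
```

A purely 1-1 dialogue yields exactly two views, one per participant, which fits the lower end of the estimate above. In a busy chat room the number of (user, interlocutor) pairs grows with how thoroughly the active users' messages are interleaved.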

The first upside of this model is that we can create (what seems to be) strong inductive biases towards developing a large self-other overlap (see also this AI Safety Camp project by AE Studio):
(1) connecting the "user's own" SSM blocks and interlocutor's SSM blocks into the residual stream symmetrically (maybe just through parallel connection, as in multi-head attention);
(2) using the same weights for the user's own and interlocutor's SSM blocks^[2] (at inference time blocks are separate and track states separately, but their weights are the same and updated in lockstep batch after batch); and
(3) probably some extra regularisation techniques, such as intermittent "forgetting" of either the user's own or the interlocutor's state (which is not completely unlike some real-world situations for humans: sometimes people tell us that we have met before but we don't remember them), thus teaching the model to degrade gracefully under these circumstances.
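
To make points (1)-(3) concrete, here is a toy sketch under strong simplifying assumptions. The scalar recurrence below is not Mamba and not a real SSM block; it only shows one set of parameters driving two separately tracked states, a symmetric merge, and random state resets. All names (`SharedRecurrence`, `run_pair`, `p_forget`) are hypothetical.

```python
import random


class SharedRecurrence:
    """Toy scalar linear recurrence h <- a*h + b*x standing in for an SSM block.

    This is NOT Mamba; it only illustrates one parameter set driving two states.
    """

    def __init__(self, a=0.9, b=0.1):
        self.a = a
        self.b = b

    def step(self, state, x):
        return self.a * state + self.b * x


def run_pair(block, own_inputs, other_inputs, p_forget=0.0, rng=None):
    """Track the user's own and the interlocutor's states with the same block."""
    rng = rng or random.Random(0)
    own_state, other_state = 0.0, 0.0
    outputs = []
    for x_own, x_other in zip(own_inputs, other_inputs):
        # Intermittent "forgetting": reset one of the two states at random.
        if rng.random() < p_forget:
            if rng.random() < 0.5:
                own_state = 0.0
            else:
                other_state = 0.0
        own_state = block.step(own_state, x_own)
        other_state = block.step(other_state, x_other)
        # Symmetric, parallel merge into the (toy) residual stream.
        outputs.append(own_state + other_state)
    return outputs


block = SharedRecurrence()
xs, ys = [1.0, 0.0, 2.0], [0.5, 1.5, 0.0]
out = run_pair(block, xs, ys)
swapped = run_pair(block, ys, xs)
```

Because the weights are shared and the merge is symmetric, swapping the roles of "self" and "other" leaves the output unchanged in this toy. A real model would presumably break this symmetry elsewhere (for example, through the conversation stream), which this sketch does not attempt to show.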

Industrial applications

As I already mentioned at the beginning of the post, I originally thought about this model as a base model that can be fine-tuned to predict whether the human user will find this or that information novel, insightful, boring, helpful, saddening, fun, and so on. This fine-tuned model, in turn, could be used within a browser extension to reorder comments on websites (YouTube, Reddit, Facebook, Twitter feed or replies, NYT, The Guardian, etc.) to order the "good" or "informationally valuable" comments first, which (I hope) should change the dynamics of the online echo chambers.

More generally, SociaLLM can improve almost all applications that currently use LLMs and for which personalisation matters more than raw reasoning and creative power: personalised content recommendations and filtering, customer service and engagement, education and language learning assistants, mental health and personal counselling (à la Pi AI).

In the media and entertainment industries, SociaLLM could also be helpful in narrative analysis (for mass media products, such as movies and novellas) and interactive storytelling for the new forms of media and games.

There are also possible applications that enhance the collective intelligence of teams:

An add-on for a team chat platform (such as Slack) that spots discrepancies in knowledge (or opinion) between team members, as described in the paper "Collective Intelligence in Human-AI Teams: A Bayesian Theory of Mind Approach" (Westby & Riedl, 2023).

A conflict resolution app for teams, friend groups, and families.

Research and AI safety applications

The value of SociaLLM in social science research should be obvious: it could be directly used for research and experiments in language intentionality, Theory of Mind, social group or team dynamics, etc.

Beware: the discussion below is somewhat above my pay grade in terms of statistics and ML theory. Take it with a grain of salt, and if something in it looks wrong to you, please point it out.

Collective intelligence mechanisms and research (such as "Collective Intelligence in Human-AI Teams" mentioned above) often require a measure of the information content of the messages that agents send to each other. For SociaLLM to provide such a measure, the user's own and interlocutor's SSM blocks must use the same weights (as suggested above), so that we can treat these SSM blocks as producing the same state representation structure.

Also, for such an informational measure, the SSM blocks should simultaneously provide the energy measure of the current state, i.e., the SSM blocks should simultaneously be Energy-Based Models (EBMs). I'm not sure how to engineer this into SSM blocks. Maybe the techniques from the "Recurrent Neural Filters" paper (Lim, Zohren, and Roberts, 2020) should help, where the Error Correction term aka auto-encoding (posterior) error can be used as the current state's energy. If you have other ideas on how to turn SSM models into (quasi-)energy-based models (or better yet, Bayesian models, but this seems a taller order), please share.
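
As a very rough illustration of "error as energy", here is a toy scalar filter. It is not the method of the Recurrent Neural Filters paper and not an SSM block; the parameters `a`, `k`, and `c` are hypothetical, and the error is taken before the correction step, which is a simplification. The only standard fact used is that, under a unit-variance Gaussian observation model, half the squared error equals the negative log-likelihood up to an additive constant.

```python
def energy(state, observation, c=1.0):
    """Quasi-energy of an observation given the state: half the squared decoding error.

    With a unit-variance Gaussian observation model this equals the negative
    log-likelihood up to an additive constant.
    """
    error = observation - c * state
    return 0.5 * error * error


def filter_with_energy(observations, a=0.9, k=0.5, c=1.0):
    """Run a toy scalar filter and record the energy of each incoming observation.

    a: state decay, k: error-correction gain, c: decoding weight (all hypothetical).
    """
    state = 0.0
    energies = []
    for x in observations:
        predicted = a * state  # propagate the state forward
        energies.append(energy(predicted, x, c))
        state = predicted + k * (x - c * predicted)  # correct using the error
    return energies


expected = filter_with_energy([1.0, 1.0, 1.0, 1.0])
surprising = filter_with_energy([1.0, 1.0, 1.0, 5.0])
```

In this sketch, an observation that the current state predicts well gets low energy, and an unexpected one gets high energy. Whether something similar can be engineered into real SSM blocks is exactly the open question raised above.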

On the AI safety front, SociaLLM could also be used to study (social) deception (e.g., when analysing Diplomacy game logs) and collusion, and, perhaps, help to engineer and test the mechanisms to disincentivise or prevent deception and collusion in AI teams aka agencies.

^[1] Note: this is a proposal, the model hasn't been trained (or even designed in detail) yet!

^[2] This feature of the architecture is also important for measuring the information content of the messages in collective intelligence mechanism design and collusion and deception detection, as explained in the section "Research and AI safety applications" below.