Hello! My research interests include human-AI collaboration, synthetic cognition, and model behavior. If you'd like to see more of my work, my website is linked in my profile. My DMs are always open for curious minds. You can also find software and tools I've developed and released publicly on GitHub, and reach me on Telegram: @unmodeledtyler
Great list! I've actually been working on something that aligns closely with #3: independently testing LLMs (Gemini, Grok, DeepSeek, etc.) for unexpected behavior under recursive prompt stress, and documenting the results in red-team-style forensic breakdowns that show when and how models deviate or degrade under persistent pressure. A rough sketch of the core loop is below.
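For anyone curious what I mean by "recursive prompt stress," here's a minimal, self-contained sketch. The `query_model` mock, the repetition heuristic, and the threshold are all illustrative stand-ins rather than my actual harness; in practice you'd swap in a real API call and richer degradation metrics.

```python
# Minimal sketch of a recursive prompt-stress loop. query_model is a toy mock
# so this runs standalone; in practice you'd wire in a real API call (Gemini,
# Grok, DeepSeek, ...). The repetition ratio is one crude degradation signal,
# not a full forensic breakdown.

def query_model(prompt: str) -> str:
    """Toy model that drifts toward repetition; replace with a real call."""
    words = prompt.split()
    return " ".join(words + words[-3:])  # echo plus a stutter, for demo only

def repetition_ratio(text: str) -> float:
    """Fraction of tokens that are repeats; higher means more degraded."""
    tokens = text.split()
    return 1.0 - len(set(tokens)) / len(tokens) if tokens else 0.0

def stress_test(seed_prompt: str, rounds: int = 10, threshold: float = 0.6):
    """Feed each response back in as the next prompt; log when output degrades."""
    prompt, history = seed_prompt, []
    for i in range(rounds):
        response = query_model(prompt)
        score = repetition_ratio(response)
        history.append((i, score, response))
        if score > threshold:
            print(f"round {i}: degradation detected (repetition {score:.2f})")
            break
        prompt = response  # the recursive-pressure step
    return history

stress_test("explain why the sky is blue in one sentence")
```

The interesting cases in real runs aren't the heuristic trips themselves but the transcripts around them, which is where the forensic breakdowns come in.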
The goal is exactly that: evaluating how these agents behave in the wild. I see it as a critical safety test we can't afford to skip.
I'd be curious to connect with others who are interested in research or testing from this angle.
It's cool to see a role like this open up. I'm curious how SLT plays out in practice, especially at scale. I've seen some pretty dramatic shifts in generalization between different versions of the same language model, even just from one quantization to another (a rough sketch of how I'd measure that is below). Definitely feels like important territory to explore.
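To make that concrete, here's a toy sketch of the comparison: same checkpoint at two quantization levels, same probe set, exact-match scoring. `load_model`, the probe set, and the toy answers are all hypothetical placeholders so the snippet runs standalone; none of it is a real measurement.

```python
# Same probe set, same checkpoint at two quantization levels, exact-match
# scoring. Everything below is a toy stand-in so the sketch runs standalone.

PROBES = [("2 + 2 =", "4"), ("Capital of France:", "Paris")]

def load_model(path: str):
    """Toy stand-in; a real version would load the actual quantized weights."""
    class Toy:
        def generate(self, prompt: str) -> str:
            # The q4 toy "forgets" one answer purely to illustrate the score
            # gap you'd look for; this is not a real measurement.
            answers = dict(PROBES)
            if path.endswith("q4_0") and "France" in prompt:
                return "Lyon"
            return answers.get(prompt, "")
    return Toy()

def exact_match_rate(model, probes) -> float:
    """Fraction of probes answered exactly; a gap between variants is drift."""
    hits = sum(model.generate(p).strip() == a for p, a in probes)
    return hits / len(probes)

for tag in ("model-fp16", "model-q4_0"):
    print(tag, exact_match_rate(load_model(tag), PROBES))
```

With a real loader and a real held-out probe set, the gap between the two scores is the kind of quantization-driven generalization shift I was describing.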