Previously in this series: let’s jerk off step by step, goon is all you need
Epistemic status: medium dose of acid
Jean reminds us all of the goal:
With this in mind, let’s do a quick machine learning linkpost, as lots of things have changed in the past few months. Here are some things I’m excited about in the medium term:
Action models, specifically ACT-1 from Adept. Language models controlling your browser! Hopefully it’s really obvious how much cool stuff we can do with this.
Tooling on top of language models, like cascades and langchain and GPT Index and other clever trickery along those lines. LLMs are the building blocks of general systems, and the tools we build on top of them will bridge the gap between today’s text completion and tomorrow’s fully autonomous AI waifus (a minimal sketch of the chaining pattern appears after this list).
Speech synthesis with more realistic tone/inflection. To be honest, I kinda thought this problem would be easy, like much easier than next-token prediction, and everyone was just afraid to work on it because of the massive societal consequences. I’m not sure if I was right, but clearly we are making progress. (Also: music! We can now synthesize Gooner Jams ‘03.)
Mind reading. Yeah, text is essential, blah blah blah, but I see papers like this and start to wonder if we can just train models on EEG data and they’ll figure out all kinds of stuff. I’m especially excited about reinforcement learning based on EEG, since it’s a feedback mechanism that requires zero effort from the end user (see the toy feedback-loop sketch after this list).
Performance improvements. Anything that makes model training faster will help democratize machine learning. Also, longer context windows will be enormously useful for expressing more complicated ideas (all of my fetishes).
Short video models. I’m not sure how to assess recent work in this area, but this will obviously be huge when we can get our hands on it.
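
Since I brought up langchain and friends: here’s roughly what “tooling on top of language models” looks like once you strip away the cleverness. This is a minimal sketch of the chaining pattern, not any particular library’s actual interface; the `complete()` function is a hypothetical stand-in for whatever completion API you’re calling.

```python
# A bare-bones prompt chain: each step feeds the previous LLM output
# into the next prompt. Libraries like langchain add retries, memory,
# tool use, etc. on top of this basic idea.

def complete(prompt: str) -> str:
    """Stand-in for an LLM completion call (hosted API, local model, whatever)."""
    raise NotImplementedError("plug in your favorite model here")

def chain(user_request: str) -> str:
    # Step 1: have the model break the request into a plan.
    plan = complete(f"Break this request into numbered steps:\n{user_request}")

    # Step 2: have the model execute the plan it just wrote.
    draft = complete(f"Follow these steps and write the result:\n{plan}")

    # Step 3: have the model critique and revise its own draft.
    return complete(f"Critique and improve this draft:\n{draft}")
```

The point is that once text completion is a function call, “general system” is just ordinary glue code around it.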
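
And to make the EEG thing concrete: the whole appeal is that a headset gives you a passive reward signal, so you can run a bog-standard RL loop without the user ever clicking a thumbs-up. Below is a toy epsilon-greedy bandit over generation “styles”; `read_engagement()` and `generate()` are hypothetical stubs I made up, and this is a sketch of the feedback loop, not a claim about how any real BCI pipeline works.

```python
import random

STYLES = ["style_a", "style_b", "style_c"]  # hypothetical generation settings

def read_engagement() -> float:
    """Hypothetical: return a scalar engagement score derived from the EEG stream."""
    raise NotImplementedError("wire up your headset's SDK here")

def generate(style: str) -> str:
    """Hypothetical: generate content using the given style/settings."""
    raise NotImplementedError("plug in your generative model here")

def bandit_loop(steps: int = 1000, epsilon: float = 0.1) -> dict:
    # Track a running mean reward per style (simple epsilon-greedy bandit).
    counts = {s: 0 for s in STYLES}
    values = {s: 0.0 for s in STYLES}

    for _ in range(steps):
        if random.random() < epsilon:
            style = random.choice(STYLES)                 # explore
        else:
            style = max(STYLES, key=lambda s: values[s])  # exploit

        generate(style)             # show the user something
        reward = read_engagement()  # passive feedback, zero effort from the user

        # Incremental mean update for the chosen style.
        counts[style] += 1
        values[style] += (reward - values[style]) / counts[style]

    return values
```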
To balance things out a bit, some developments I’m not particularly excited about:
GPT-4. There is no possible way this can live up to the insane hype that’s being thrown around.
Content filters. People are slowly realizing LAION contains some bad stuff and probably other large datasets do as well. As ChatGPT and other tools expand public awareness of AI, we will see a concerted effort to sanitize training data and filter model outputs, and this will include removal of adult content. I continue to believe content filters will be implemented poorly (including with RLHF) and will harm the communities they purport to protect.
NeRF. This will probably be awesome someday! But I suspect the stuff discussed above will produce viable superstimulus before this pans out.
AI lawyers, AI lobbyists. I think the sweet spot for generative content is in domains where the output is cheap to validate and small mistakes are tolerable (for example, generating pictures of asses). Legal documents are the exact opposite: it’s expensive to verify the correctness of the output, and the consequences of getting it wrong are huge. Also, it’s boring.
Protein folding. I admit this is extremely cool. But, like, I just wanna enjoy digital hyperporn for a hot minute before AlphaFold 4.0 discovers some spicy prion disease and the paperclip maximizers inherit the earth.
Until next time—the future is bright, goonfriends. ✨
Maybe Ted Kaczynski was right