LLMs, Part 2

Taken from “A Path Towards Autonomous Machine Intelligence”, by Yann LeCun

Introduction

Following the formulation in Distilling System 2 into System 1, given an input $x$, System-1 produces the output $y$ directly: $S_{\mathrm{I}}(x)=f_{\theta}(x)\to y$. In contrast, System-2 takes an LLM $f_{\theta}$ and an input $x$, and generates intermediate tokens $z$ before the output: $S_{\mathrm{II}}(x;f_{\theta})\to z,y$, which can be seen as a form of meta-learning. I plan to design scalable System-2 LLMs from the meta-learning perspective.
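The contrast between the two formulations can be sketched in code. This is a minimal, hypothetical illustration: `f_theta` is a toy stand-in for an LLM that answers simple arithmetic prompts, and all function names are illustrative assumptions, not from the source.

```python
# Toy sketch of System-1 vs. System-2 (all names hypothetical).

def f_theta(prompt: str) -> str:
    """Toy stand-in for an LLM f_theta: evaluates an arithmetic expression."""
    expr = prompt.rsplit(":", 1)[-1].strip()
    return str(eval(expr))  # toy only; never eval untrusted input

def system_1(x: str) -> str:
    # S_I(x) = f_theta(x) -> y: one direct pass, no intermediate tokens.
    return f_theta(x)

def system_2(x: str) -> tuple[list[str], str]:
    # S_II(x; f_theta) -> z, y: first emit intermediate tokens z
    # (here, a decomposition into sub-steps), then produce y.
    z = [f"step: {part.strip()}" for part in x.rsplit(":", 1)[-1].split("+")]
    y = f_theta(x)
    return z, y
```

For example, `system_1("compute: 2+3")` returns the answer directly, while `system_2` additionally exposes the intermediate trace `z` that System-2 methods reason over.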

My Related Works

Neural Architectures

Multi-Modal & Reinforcement Learning

Human-AI Alignment

Meta Learning

Jie Fu
Research Scientist

Focus on Deep Learning, System-2 Language Models, Reinforcement Learning, Agents, AI Safety.