Research Statement (quite long, but please read on?)
As a muggle (?), I ultimately aim to enable Mechanistic Alignment & Grounding for Interactive Cognition (aka MAGIC).
The three constant themes of my research are Language, Interaction, and Embodiment, approached from a scalable and cognitive angle.
Let me break this down and elaborate:
Language Grounding and Alignment.
We develop our language systems from natural supervision: language develops through sensorimotor and sociolinguistic experiences in the physical world (semantic/static grounding) and through interactions with others (communicative/dynamic grounding).
Through this grounded language learning, we acquire lexical semantics and syntactic structures, and we apply our language pragmatically in everyday communication.
To me, grounding is about mapping a language system to something external—whether it be another language, perception, or shared beliefs.
I (mostly) agree with Freda Shi's view on the definition of grounding. Alignment, to me, is closely related to grounding but slightly different from it.
I see alignment as two-fold: in-vocabulary alignment (intent/value/preference/safety alignment) and out-of-vocabulary alignment (aka expanding the action space into multimodal/multilingual/code tokens).
My colleagues and I have given a tutorial on grounding, and I have written something on alignment.
I will find some time to put down my thoughts on the difference between alignment and grounding (TODO list +1). In short: grounding is a property of our language representations, whereas alignment also concerns how we generate from those representations.
Grounding and alignment: connecting language to everything non-linguistic.
Mechanistic (Mis)alignment.
In my view, the goal of cognitive science is to understand the underlying mechanisms that give rise to intelligence.
I regard humans and machines as fundamentally distinct intelligent systems, and I believe there will come a point where human-like learning will no longer offer meaningful insights for superhuman AI models.
My ultimate research question centers on what I refer to as "mechanistic (mis)alignment": investigating which factors drive shared cognitive behaviors between humans and machines, and which mechanistic differences account for their divergent cognitive behaviors.
I always remind myself to be epistemically rigorous and to avoid anthropomorphizing AI models or overclaiming.
I (mostly) agree with Alex Warstadt and Sam Bowman's view in "What Artificial Neural Networks Can Tell Us About Human Language Acquisition".
Scalable representations (learned in a data-driven but sufficiently efficient way) as computational abstractions of cognition.
The Connections
This is how I perceive the connections between the pieces.
- Referential Grounding
- Spatial/3D Grounding
- Visual Concept Manipulation
- Theory of Mind (ToM)
- Trials, Errors, Demos (TED)
- Active Learning (AL)
Acknowledgement: Thanks to Jiayuan Mao for this amazing template!
Get In Touch
You are welcome to drop me a message :)