Research Statement (Quite long, please read...?)
					As a muggle (?), my ultimate goal is to enable Mechanistic Alignment & Grounding for Interactive Cognition (aka MAGIC).
					The three constant themes of my research are Language, Interaction, and Embodiment from a scalable and cognitive angle.
					I will break it down and elaborate: 
				
					Language Grounding and Alignment. 
					We develop our language systems from natural supervision.
					Our language develops through sensorimotor and sociolinguistic experiences in the physical world (semantic/static grounding) and through interactions with others (communicative/dynamic grounding). 
					We acquire lexical semantics and syntactic structures via this grounded language learning, and we apply our language pragmatically in everyday communication. 
					To me, grounding is about mapping a language system to something external, whether it be another language, perception, or shared beliefs.
					I (mostly) agree with Freda's view on the definition of grounding, and to me, alignment is closely related to grounding but slightly different from it. 
					To me, alignment is two-fold: in-vocabulary alignment (intent/value/preference/safety alignment) and out-of-vocabulary alignment (aka expanding the action space into multimodal/multilingual/code tokens). 
					I (together with my colleagues) have a tutorial on grounding and wrote something on alignment.
					I will find some time to write down my thoughts on the difference between alignment and grounding (TODO list +1). In short, I think grounding is a property of our language representations, whereas alignment also covers how we generate language from those representations.
					Grounding and alignment: connecting language to everything non-linguistic.
					
						
					Mechanistic (Mis)alignment. In my view, the goal of cognitive science is to understand the underlying mechanisms that give rise to intelligence. 
					I regard humans and machines as fundamentally distinct intelligent systems, 
					and I believe there will come a point where human-like learning will no longer offer meaningful insights for superhuman AI models. 
					My ultimate research question centers on what I refer to as "mechanistic (mis)alignment": investigating which factors drive shared cognitive behaviors between humans and machines, 
					and which mechanistic differences account for their divergent cognitive behaviors.
					I always remind myself to be epistemologically rigorous and to avoid both anthropomorphizing AI models and overclaiming.
					I (mostly) agree with "A roadmap for reverse-engineering the infant language-learner" and "What Artificial Neural Networks Can Tell Us About Human Language Acquisition".
					Scalable (data-driven but sufficiently efficient learning of) representations as computational abstractions of cognition.
					
						
The Connections
This is how I perceive the connections between the pieces.
- Learning Crossmodal Correspondence
- Learning Underlying World Representation
- Learning Visual Concept Manipulation
- Theory of Mind (ToM)
- Trials, Errors, Demos (TED)
- Proactivity and Steerability
Acknowledgement: Thanks to Jiayuan Mao for this amazing template!
Get In Touch
You are welcome to drop me a message :)
