Research Statement (Quite long, please read...?)

As a muggle (?), my ultimate goal is to enable Mechanistic Alignment & Grounding for Interactive Cognition (aka MAGIC). The three constant themes of my research are Language, Interaction, and Embodiment from a scalable and cognitive angle. I will break it down and elaborate:

Language Grounding. The ultimate aim of alignment research revolves around grounding, whether through multimodal alignment or intention/value alignment. Our language develops through sensorimotor and sociolinguistic experiences in the physical world (semantic/static grounding) and through interactions with others (communicative/dynamic grounding). We acquire lexical semantics and syntactic structures via this grounded language learning, and we apply our language pragmatically in everyday communication. To me, grounding is about mapping a language system to something external—whether it be another language, perception, or shared beliefs.

Grounding and alignment: connecting language to everything non-linguistic.
  • Grounding language to the physical world: Understanding and generating language that is grounded in sensorimotor experiences and physical situations. There is more to look at beyond 2D grounding, e.g., video, 3D, generative world models.
  • Grounding language to human interactions: (Co-)situated Human-AI interaction in a shared environment with disparate mental states, and collaboration towards common ground.
  • Alignment in post-training and at inference time: Human-like planning and reasoning that is deliberate, (inter)active, lifelong, and steerable, built on top of pre-trained systems.
  • Applications of frontier models in situated/embodied agents as well as content generation.
  • I (mostly) agree with Freda Shi's view on the definition of grounding, and grounding != alignment. We wrote something on alignment, and I will find some time to put down my thoughts on the difference between alignment and grounding (TODO list +1).


Mechanistic (Mis)alignment. In my view, the goal of cognitive science is to understand the underlying mechanisms that give rise to intelligence. I regard humans and machines as fundamentally distinct intelligent systems, and I believe there will come a point where human-like learning will no longer offer meaningful insights for superhuman AI models. My ultimate research question centers on what I refer to as "mechanistic (mis)alignment": investigating which factors drive shared cognitive behaviors between humans and machines, and which mechanistic differences account for their divergent cognitive behaviors.

Scalable (data-driven but sufficiently efficient learning of) representations as computational abstractions of cognition.
  • Scaling laws and developmental psychology: Exploring the developmental trajectories of data-driven models and the cognitive capabilities that emerge over the course of development.
  • Efficient learning with minimal supervision: Learning that is data-efficient, over multiple modalities, on (semi-)structured data.
  • I (mostly) agree with Alex Warstadt and Samuel Bowman's view in "What Artificial Neural Networks Can Tell Us About Human Language Acquisition".
  • Cross-cultural and cross-lingual conventions of cognition. (Languages are dying! Under-represented languages are dear to my heart but I plan (try hard) not to do (too much) research on this topic before I finish my PhD lol)
The Connections

    This is how I perceive the connections between the pieces.

    Language-centered blueprint: grounding language to physical embodiment and interactions

    Vision and Physical Embodiment

    Interaction with Humans and Other Agents


    Cognitively-centered blueprint: the mechanistic (mis)alignment between humans and machines

    To be updated.

    Acknowledgement: Thanks to Jiayuan Mao for this amazing template!

    Get In Touch

    You are welcome to drop me a message :)

    • Phone

      xxx-xxx-xxxx
    • marstin0607
    • ziqiao_ma
    • Address

      Bob and Betty Beyster Building 4909,
      2260 Hayward Street,
      Ann Arbor, MI 48109.