About

I believe the real value of large language models lies not in answering everything, but in being trustworthy and useful in specific contexts.

I graduated from East China University of Science and Technology with a degree in Automation and currently work as a Senior Algorithm Engineer at Cylingo Group, where I lead the R&D of the Xinyuan series of domain-specific large language models. Over the past few years, I have focused on affective intelligence and the deployment of domain-specific large models. My work spans the full cycle, from data curation and continual pre-training through SFT, RLHF, and model deployment to community operations, in scenarios including mental health, healthcare, retail, and community applications.

I have led projects including Xinyuan-LLM, Xinyuan-VL, MindChat, Sunsimiao, ColugoMum, and OXiaoPeng. The open-source projects I have led or contributed to have collectively earned 20,000+ GitHub Stars and appeared multiple times on GitHub Trending. I have also published 1 book on LLM application development and 4 papers, and I hold 1 utility model patent and 3 software copyrights. My work has received 7 national-level awards and over 10 provincial/municipal awards.

The question I keep asking: How can we make AI more usable, safer, and more sustainable in real human scenarios?

News

2026.05: Co-authored LangChain Large Model Application Development: From Beginner to Practice, Tsinghua University Press.
2026.01: First Prize in the 2025 Synthetic Data Competition · Lingxi AI for Mental Health track.
2025.09: Speaker at the 2025 Apsara Conference session “Doing AI, the Gen-Z Way”.
2025.07: Bronze Award in the 2024 China International College Students’ Innovation Competition.
2025.07: Joint interview by Alibaba Cloud Tongyi Lab and ModelScope community.

Experience

Cylingo Group | Senior Algorithm Engineer
March 2024 - Present

Problem: General-purpose LLMs tend to give generic responses in mental health and education scenarios. They lack domain knowledge and struggle with safety and empathy boundaries.

Solution: Led the full-cycle R&D of the Xinyuan series of domain-specific LLMs across 45M+ data samples, covering data de-identification, deduplication, and quality scoring, followed by continual pre-training, SFT, RLHF, and multi-dimensional evaluation. Coordinated a team of 5 to 20 algorithm and engineering members.

Results:

Released Xinyuan-VL-2B, which ranked first on OpenCompass in the under-4B parameter category at the time.
Released Xinyuan-LLM-14B-0428, a domain foundation model for mental health and education.
Led external technical evangelism, including a keynote at the Alibaba Cloud ModelScope developer event and a joint interview with Alibaba Cloud Tongyi Lab and ModelScope.

Selected Projects

Health LLM Knowledge Platform

Problem: Mental health and medical consultation require high levels of safety, professionalism, and empathy. Deploying general-purpose models directly carries significant hallucination and safety risks.

Solution: Built a suite of health-focused LLMs spanning foundation models, knowledge enhancement, and application deployment. Processed millions of medical records and hundreds of thousands of psychological dialogues, totaling 4M+ samples. Applied RAG and a three-dimensional memory system to reduce hallucinations.

Results: Incubated Sunsimiao medical LLM (400+ Stars) and MindChat mental health LLM (700+ Stars). Connected model capabilities to WeChat via OXiaoPeng, reaching 2,000+ direct users and 20,000+ indirect users.

ColugoMum: Smart Retail Settlement Platform

Problem: In unmanned retail, products are numerous, visually similar, and frequently updated. Traditional object-detection approaches require retraining for every new product, leading to high maintenance costs.

Solution: Developed an image-retrieval-based product recognition algorithm that eliminates the need for retraining when new products are added.

Results: Won the National First Prize at the 2022 China Robot and Artificial Intelligence Competition, was exhibited at Baidu Wave Summit+ 2021, and accumulated 100,000+ views and 200+ GitHub Stars.

OXiaoPeng

Problem: Early in the LLM wave, mainstream models were fragmented across separate services and hard for ordinary users and small communities to access.

Solution: Built a multi-model aggregation application integrating Baidu ERNIE, PanGu, Yuan1.0, ChatYuan, and ChatGPT, accessible through WeChat, Feishu, and QQ.

Results: Served as a unified access layer for community evaluation and feedback, reaching 2,000+ direct users and 20,000+ indirect users. The project was shortlisted for an interview for the 2023 MiraclePlus Spring Camp (S23).

Publications & IP

Zhou Tao, Xue Dong, Yan Xin. LangChain Large Model Application Development: From Beginner to Practice. Tsinghua University Press.
D. Xue, J. Tu, M. Wang, X. Yan, F. Liu and J. Hu, “Towards Privacy-Preserving Mental Health Support with Large Language Models,” arXiv preprint arXiv:2601.01993, 2026.
F.-Q. Cui, J. Huang, S. Zhao, J.-M. Guo, Q. Cai, X. Yan and Z. Liu, “ReMA: A Training-Free Plug-and-Play Mixing Augmentation for Video Behavior Recognition,” arXiv preprint arXiv:2601.00311, 2026.
F.-Q. Cui, J. Huang, S. Zhao, X. Li, X. Yan, Z. Jia and X. Zhou, “Robust Low-Rank Sparse Framework for Video-Based Affective Computing,” 2025.
X. Yan, Q. Hu, X. Huang and C. Shen, “Intelligent Retail Settlement Platform based on Image Retrieval,” CISCE, 2022.
Utility Model Patent: Smart Retail Settlement Platform.
Software Copyrights: ColugoMum, Intelligent Waste Sorting System, Domain-Knowledge-Based Q&A System.

Awards & Honors

Featured Awards

First Prize, 2025 Synthetic Data Competition · Lingxi AI for Mental Health
Bronze Award, 2024 China International College Students’ Innovation Competition
Bronze Award, 2023 9th China International “Internet+” Innovation Competition (Lingxin Intelligence)
Silver Award, 2022 8th China International “Internet+” Innovation Competition (Xiaosheng Technology)
National First Prize, 2022 24th China Robot and Artificial Intelligence Competition

Selected Honors

Baidu PaddlePaddle Developer Expert (PPDE)
OpenAtom Foundation Active Open Source Contributor
Baidu AIStudio 2022 Top 10 Influential Figures
OpenI Community Core Early Experience Officer
Datawhale Member
Qwen Ambassador

Talks & Media

Invited Talks

2025.09: Speaker, Apsara Conference “Doing AI, the Gen-Z Way”
2025.05: Alibaba Cloud ModelScope Developer Co-Creation Event, “Open Source Technology Driving Pan-Psychological Services and AI Inclusion”
2023.06: PaddlePaddle LLM Application Development Course, “Building an Intelligent Document Query Assistant”
2023.05: Datawhale AIGC Learning Program, “Building a Local Knowledge Base QA Application with LangChain and ChatGLM-6B”
2022.04: PaddlePaddle Industry Practice Library, “Product Recognition Industry Application”

Media Coverage

Alibaba Cloud Tongyi Qwen: “Writing Code, Writing Emotions”
Founder Park: Interview on Qwen 3 and Cylingo Group
Alibaba Cloud Tongyi Qwen: “Tongyi Qwen + Mental Health = ?”
ScienceNet: “They Developed a Mental Health LLM with Tongyi Qwen”
Synced SOTA Models: Multiple new releases of MindChat and related projects

Community & Activities

PaddlePaddle Navigator Group | East China Regional Lead
2021.09 - 2022.07

Oversaw operations across 7 provinces in East China, expanding coverage to 50+ universities including Zhejiang University, Southeast University, ShanghaiTech, Nanjing University of Aeronautics and Astronautics, Soochow University, ECUST, and Hefei University of Technology.

ECUST PaddlePaddle Navigator Group | Lead
2021.04 - 2022.01

Built the university-level navigator group from scratch, organizing lectures, competitions, and hands-on projects, reaching 100+ participants and incubating 10+ high-quality projects.
Drove the implementation of the Baidu-ECUST industry-education collaboration course.

Education

East China University of Science and Technology | School of Information Science and Engineering | Automation
2019.09 - 2023.06

Contact

I continue to advance the Xinyuan model series while exploring multi-agent collaboration, embodied AI with affective interaction, and sustainable business models for domain-specific LLMs.

I am open to collaboration in:

Domain-specific LLMs for mental health, healthcare, education, and community services
Affective intelligence and multimodal interaction products
Open source community building and technical evangelism

Reach me at yx20001210@163.com or on GitHub.


Xin Yan