About

Biography

I currently live in Ontario, Canada. I graduated from King’s College London with a master’s degree in Artificial Intelligence. Before that, I graduated from the University of Birmingham with a First-Class Honours degree.

My professional experience bridges advanced AI research and software product development. At Shengqu Games, I led the creation of a 20B-parameter code-completion language model, integrating PagedAttention for low-latency inference and achieving a 3× speedup in token generation. I designed distributed training pipelines using DeepSpeed and Accelerate on multi-GPU clusters, and developed a Visual Studio Code extension in TypeScript for real-time AI-assisted coding. At BOC International, I implemented transformer-based market forecasting models, optimized financial data pipelines in Python and SQL, and built React dashboards for real-time market trend visualization in high-frequency trading environments.

My academic research focuses on Natural Language Generation and Few-Shot Learning. I previously worked in the Covid Corpus PhD group led by Dr Mohammed Bahja at the University of Birmingham. My responsibility was to design and implement Natural Language Processing modules, including Topic Modeling, Text Summarization and Article Recommendation.

Visualization Demo

After that, I focused mainly on Natural Language Generation tasks. During my master’s degree, I worked on Few-Shot Learning for Table-to-Text Generation. The model achieves state-of-the-art results using techniques such as memory storage and prototype instance selection.

Here is my Resume.

Software & Technical Highlights

  • Full-Stack Development: Proficient in Python, Java, C++, JavaScript, SQL, React, and Django for building dynamic, data-driven web applications.
  • API Design & Deployment: Experienced with RESTful API development using FastAPI, Django, and Docker for scalable and secure backend services.
  • High-Performance Computing: Skilled in distributed model training on multi-GPU systems (A100) with DeepSpeed and Accelerate.
  • Data Engineering: Expertise in database optimization, large-scale data preprocessing, and real-time visualization systems.
  • Dev Tools & Integration: Created custom IDE extensions (VS Code) for seamless AI model integration into developer workflows.
  • Machine Learning Frameworks: Advanced usage of PyTorch, Transformers, Optuna, and NLP evaluation metrics (BLEU, ROUGE).

Work Experience

Machine Learning Engineer Internship at Shengqu Games

06.2023 – 12.2023

During this internship, I used large language models (LLMs) to improve the software development workflow. The goal was to train an LLM on the company’s existing API documentation so that it could assist developers with real-time coding suggestions and conversational guidance directly related to the company’s APIs.

Technical Highlights:

  • Large-Scale LLM Development: Designed and trained a 20B-parameter code-completion model using PyTorch, optimized with DeepSpeed ZeRO and Accelerate on a 4×A100 GPU cluster.
  • Low-Latency Inference: Integrated PagedAttention for a 3× speedup in token generation, enhancing real-time user experience in development environments.
  • API Engineering: Built a production-grade RESTful API backend with FastAPI on Docker, supporting real-time AI inference requests for code completion and instructional chat.
  • Developer Tool Integration: Created a Visual Studio Code extension in TypeScript for GitHub Copilot–style in-editor AI assistance, with direct access to internal APIs and documentation.
  • Custom Evaluation Framework: Developed a tailored benchmarking system for code-completion tasks to measure model accuracy, response speed, and developer satisfaction metrics.
  • Open-Source Contribution: Released the AI-assisted coding platform as DevAssistant, demonstrating reproducible training pipelines and inference workflows. Project: https://github.com/Miraclove/DevAssistant
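
As a rough illustration of the benchmarking idea in the highlights above, the sketch below scores a completion function on exact-match accuracy and mean response time. It is not the framework built during the internship: `CompletionCase`, `evaluate`, and `toy_complete` are hypothetical names, and exact match is only an assumed stand-in for whatever accuracy measure was actually used.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CompletionCase:
    prompt: str    # code prefix given to the model
    expected: str  # reference completion


def evaluate(complete: Callable[[str], str], cases: List[CompletionCase]) -> dict:
    """Return exact-match accuracy and mean latency (seconds) over the cases."""
    if not cases:
        raise ValueError("cases must not be empty")
    hits = 0
    total_time = 0.0
    for case in cases:
        start = time.perf_counter()
        output = complete(case.prompt)
        total_time += time.perf_counter() - start
        # Compare after stripping surrounding whitespace only.
        if output.strip() == case.expected.strip():
            hits += 1
    return {
        "exact_match": hits / len(cases),
        "mean_latency_s": total_time / len(cases),
    }


def toy_complete(prompt: str) -> str:
    # Hypothetical stand-in for a real model call: a one-entry lookup table.
    table = {"def add(a, b):\n    return ": "a + b"}
    return table.get(prompt, "")


cases = [
    CompletionCase("def add(a, b):\n    return ", "a + b"),
    CompletionCase("def sub(a, b):\n    return ", "a - b"),
]
report = evaluate(toy_complete, cases)
print(report["exact_match"])
```

In practice `toy_complete` would be replaced by a call to the inference API, and exact match would usually be complemented by softer measures, since many correct completions differ textually from the reference.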

More Detail


Quantitative Machine Learning Internship at BOC International

01.2023 – 06.2023

As a Quantitative Machine Learning Intern in the Fintech Department in Shanghai, China, I worked on combining advanced machine learning with high-frequency trading (HFT).

Technical Highlights:

  • Financial Time-Series Modeling: Adapted Transformer, GPT, and BERT architectures for market forecasting, enabling predictive modeling in high-frequency trading (HFT) scenarios.
  • Data Engineering Pipelines: Designed Python (Pandas) and SQL workflows to preprocess and transform large-scale tick-by-tick market datasets, extracting actionable trading signals.
  • Production Model Deployment: Integrated machine learning models into low-latency trading execution systems, collaborating with software engineers to ensure fault tolerance and real-time monitoring.
  • Hyperparameter Optimization: Applied Optuna and grid search to optimize model parameters, improving forecasting stability under volatile market conditions.
  • Real-Time Visualization Tools: Built a React + JavaScript dashboard to visualize live financial trends, enhancing traders’ decision-making speed and accuracy.
  • Backtesting & Validation: Conducted rigorous historical simulations to validate model robustness, improving predictive accuracy by 15% compared to baseline.

More Detail
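
To make the backtesting idea above concrete, here is a minimal walk-forward sketch: each forecast uses only data available before the predicted point, and the model's error is compared against a naive baseline. Everything in it is an assumption for illustration: the price series is toy data, `linear_trend` is a placeholder for a real forecasting model, and mean absolute error is an assumed metric, since the metric behind the figure quoted above is not stated.

```python
from typing import Callable, List, Sequence


def walk_forward_mae(series: Sequence[float],
                     predict: Callable[[Sequence[float]], float],
                     min_history: int) -> float:
    """Mean absolute error of one-step-ahead forecasts made from past data only."""
    errors: List[float] = []
    for t in range(min_history, len(series)):
        forecast = predict(series[:t])  # history strictly before time t
        errors.append(abs(forecast - series[t]))
    if not errors:
        raise ValueError("series is too short for the given min_history")
    return sum(errors) / len(errors)


def last_value(history: Sequence[float]) -> float:
    # Naive baseline: tomorrow equals today.
    return history[-1]


def linear_trend(history: Sequence[float]) -> float:
    # Toy "model": extrapolate the most recent step.
    return history[-1] + (history[-1] - history[-2])


prices = [100.0, 101.0, 102.0, 103.0, 104.0, 106.0]  # toy data
baseline_mae = walk_forward_mae(prices, last_value, min_history=2)
model_mae = walk_forward_mae(prices, linear_trend, min_history=2)
improvement = (baseline_mae - model_mae) / baseline_mae
print(f"relative MAE improvement: {improvement:.0%}")
```

Slicing the history at `t` is what prevents look-ahead bias, which is the main thing a backtest has to guard against.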

Research Experience

Few-shot Learning for Text Generation

10.2021 – 06.2022

At King’s College London, I carried out research on few-shot learning, a subfield of machine learning concerned with the ability of models to learn from a limited amount of data. My work focused on text generation from structured data: producing coherent, contextually relevant text from inputs such as tables.

More Detail


Software Developer at Covid Corpus

10.2020 – 06.2021

In this project, I applied my machine learning skills to a large body of COVID-related academic literature. It was my final-year project, carried out under the guidance of my project supervisor, and it was central to my exploration of natural language processing (NLP) and its real-world applications.

Visit Visualization Demo

More Detail

Research Interests

Natural Language Processing, Natural Language Generation, Chatbots, AI Art.

Personal Skills

Python, Java, C++, HTML, JavaScript, SQL

TensorFlow, PyTorch, scikit-learn, pandas, Docker, etc.

Research

Natural Language Processing

Recommendation and Article Summary Generation for the COVID Corpus Website with the Ph.D. team using Topic Modelling (LDA), FNN, etc.

Few-shot Learning for Text Generation using T5, BART, BERT.

Computer Vision

Carried out medical image segmentation on MRI images using U-Net, CNNs, etc., achieving state-of-the-art results.

Carried out image classification on the MNIST dataset, reaching 99% accuracy, at the top of the benchmark.

Software

Experience in using Docker and Docker Compose for microservices on web servers.

Experience in Django and React for web development, with Django as the back end and React as the front end.

Experience in Java Spring Boot and MyBatis for web development.

Experience in Java LWJGL, JavaFX, Network Programming, and MySQL for online multiplayer game development.

Experience in Java Swing, MySQL, and Network Programming for developing instant messaging applications like Skype.

Experience in DevOps and Agile software development within a team, from writing documentation and designing databases to implementing client-side and server-side software in each iteration.

Language

Native speaker of Chinese, fluent in English.