Ken Tsui

London, United Kingdom

I am a seasoned machine learning engineer with over a decade of experience in applied research and AI product development, including document intelligence and generative media. I aspire to build reliable AI models that benefit humanity.

For the past six years, my expertise has centered on language models, computer vision (detection, text recognition), data curation, synthetic data generation, and distributed training. Prior to this, I specialized in machine learning and statistical modeling for structured data, with applications in credit scoring, stress testing, and anti-attrition modeling.

As an active open-source researcher, I regularly contribute to pretraining and post-training datasets for large language models and vision-language models, as well as to reasoning benchmarks.

My earlier career as an external auditor and qualified accountant informs my rigorous, systematic approach to model evaluation and testing, ensuring robust and reliable AI solutions.

My research interests:

  • pretraining and post-training data curation
  • reasoning benchmarks, in particular self-correction and inductive reasoning
  • vision-language models
  • world models

HuggingFace
GitHub

latest posts

selected publications

  1. Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs
    Ken Tsui
    2025
  2. NumSeqBench: Benchmarking Inductive Reasoning in Language Models via Number Sequences
    Ken Tsui
    2025