Hi there

My Name's
Jack Lee Jian Ming

An ML Engineer and Data Scientist with 5+ years of experience developing AI solutions for the healthcare industry.

Expertise

How I Can Help You

Bringing innovative technology, strategic insights and industry best practices to solve your unique business problems

GenAI-powered Application

Developing advanced RAG systems, with commercial or open-source LLMs, for complex data validation, classification, and information retrieval.

End-to-End ML Pipeline

Designing comprehensive machine learning workflows from data processing to model deployment, ensuring robust and scalable solutions that drive actionable insights.

Data Pipeline Engineering

Constructing efficient data pipelines to seamlessly integrate, clean, and warehouse data from diverse sources with optimal performance.

API & Backend Development

Building high-performance RESTful APIs with asynchronous design and a caching layer to optimize response times and system efficiency.

Performance Optimization

Applying concurrency and parallel computing to accelerate I/O-bound and compute-bound workflows.

Production-Ready Deployment

Containerizing applications with Docker for seamless, reproducible deployments across development and production environments.

Skills at a glance

What I Bring To The Table

Turning ideas into scalable solutions with a versatile tech stack

Explore My Skills
Machine Learning
LlamaIndex
ChatGPT
Ollama
MLflow
BentoML
PyTorch
NumPy
Transformers
Ray

Data Engineering
dbt
Airflow
Apache Spark
PostgreSQL
MongoDB
Neo4j
Qdrant
MinIO
Great Expectations

Backend Development
Python
Git
VS Code
Poetry
FastAPI
SQLModel
Docker
AWS
GCP

Professional Experience

ML Consultant
Freelance
March 2021 - Present

  • Provided strategic consulting on ML lifecycle management, data governance, and deployment strategies for clients.
  • Designed and delivered expert workshops on ML infrastructure, platform development, and operational best practices.

Data Scientist
AMILI Pte Ltd, Singapore
Feb 2022 - Nov 2023

  • Architected and developed core components of a knowledge graph platform delivering personalized healthcare insights using advanced ML techniques.
  • Engineered scalable MLOps infrastructure in collaboration with DevOps and Data Engineering teams.
  • Created and deployed efficient data pipelines to scrape, wrangle, and manage heterogeneous healthcare and wellness data.
  • Implemented advanced NLP solutions using commercial and open-source LLMs (ChatGPT, Zephyr, Llama) for data validation and classification.
  • Led an intern team in developing NLP and Computer Vision capabilities, driving platform innovation and knowledge transfer.

Data Scientist
NovaGlobal Pte Ltd, Singapore
March 2019 - Sep 2020

  • Developed end-to-end ML pipelines for advanced computer vision applications in the medical imaging domain.
  • Collaborated with NVIDIA to integrate novel organ models into the Clara ML Pipeline using RESTful API architectures.
  • Conducted technical workshops showcasing NVIDIA RAPIDS and Clara platform capabilities to prospective clients.
  • Partnered with Microsoft to create a bibliometric Power BI dashboard using the Microsoft Academic Graph dataset.

Data Analyst
Alterquo Sdn Bhd, Malaysia
March 2018 - Feb 2019

  • Collaborated with NUHS researchers to develop an ML-driven patient stratification model using XGBoost for cardiovascular disease prediction.
  • Performed comprehensive data cleaning, statistical analysis, and predictive modeling on anonymized healthcare patient datasets.

Education

Master of Science (MSc) in Bioinformatics
Perdana University, Malaysia
2018 - 2021

Bachelor of Science (BSc) in Biotechnology (Hons)
UCSI University, Malaysia
2015 - 2018

Two-Stage RAG App with Local LLM for Medication NER

Uses hybrid search and reranking with Llama 3.2 3B to accurately extract entities from unstructured medication text.


Medical
GenAI
NER
RAG

Dockerized App for Synthetic Patient Data Generation

A dockerized application that generates synthetic healthcare data derived from real-world statistical distributions.


Medical
Data Synthesis
Docker