Danny Luo Machine Learning Engineer


Residential Real Estate Valuation

Extracted, cleaned, and preprocessed over 13 million records from a remote SQL database. Trained an XGBoost valuation model on AWS EC2.

Guide: Spark with Jupyter on AWS

A guide to setting up Spark with Jupyter on AWS EC2 instances with S3 I/O support. Presented at Toronto Apache Spark #19.

HackOn(Data) 3rd Place Project: Optimal Digital Map Placement in Toronto

A solution for determining the optimal placement of location-based information maps throughout Toronto.

University of Toronto Data Science Team

The following is work I have done with the University of Toronto Data Science Team (UDST).

Data Science Bowl Preprocessing on AWS

I use Python multiprocessing to preprocess lung CT images efficiently across all available CPU cores on AWS compute instances.
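A minimal sketch of the general idea, not the code from the project itself: a worker pool maps a per-scan preprocessing function over a list of scans, one process per core. The function names, the toy data, and the intensity window below are all assumptions made for illustration.

```python
import multiprocessing as mp

# Hypothetical intensity window (assumed values, not taken from the project).
HU_MIN, HU_MAX = -1000.0, 400.0


def preprocess_scan(pixels):
    """Clip raw values to the window and rescale them to [0, 1]."""
    span = HU_MAX - HU_MIN
    return [(min(max(p, HU_MIN), HU_MAX) - HU_MIN) / span for p in pixels]


def preprocess_all(scans, workers=None):
    """Preprocess scans in parallel; workers=None lets the pool use every core."""
    with mp.Pool(processes=workers) as pool:
        return pool.map(preprocess_scan, scans)


if __name__ == "__main__":
    # Toy stand-in for real CT data: eight small "scans" of five values each.
    toy_scans = [[-2000.0, -1000.0, -300.0, 400.0, 1000.0] for _ in range(8)]
    results = preprocess_all(toy_scans)
    print(len(results), "scans processed on up to", mp.cpu_count(), "cores")
```

Leaving `workers` unset is what spreads the work over every core the instance exposes, so the same script scales when moved to a larger compute instance.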

Dstl Satellite Image Exploration on AWS

An exploration of satellite images using AWS S3 and boto3 for the Kaggle Dstl Satellite Imagery Feature Detection challenge.


Data Science Resources

A list of useful data science resources.

Book List

A list of what I’ve read.