I had the pleasure of presenting the state-of-the-art NLP model, Google BERT, at the Toronto Deep Learning Series (TDLS) meetup. The video has 50,000+ views as of February 2020 and is the most-viewed lecture of TDLS (now called AISC).
In 2014, I entered the University of Toronto as an undergrad with a burning passion for physics. In 2018, I left the academic world to start a career in industry machine learning.
This is how I transitioned from academia to industry.
What is UofT like? Is it hard? Is it as depressing as people say it is? And what is POSt?
As a recent Bachelor of Science graduate, I answer these and more in this guide for new students.
HackOn(Data), Toronto’s very own data hackathon in the heart of downtown, is back for 2017!
At HackOn(Data) last year, I learned a great deal, had a lot of fun, and made industry connections that landed my teammate and me great summer internships (my blog post). This year I plan on volunteering for HackOn(Data) 2017.
I highly recommend HackOn(Data). Register at hackondata.com/2017!
Over the past school year (2016-2017), I have been participating in Kaggle competitions with the University of Toronto Data Science Team (UDST).
We have participated in competitions such as Outbrain Click Prediction, DSTL Satellite Imagery Feature Detection, and the Data Science Bowl 2017. I have learned a lot from my participation in UDST. In fact, it was these competitions that led me to write my Spark-Jupyter-AWS guide, and the posts on multi-CPU data processing and S3 data access with boto3.
If you are a UofT student or simply a data enthusiast in the Toronto area, come check us out! We will be continuing activities in the summer of 2017.
When the University of Toronto Data Science Team participated in Data Science Bowl 2017, we had to preprocess a large dataset (~150GB, compressed) of lung CT images. I was tasked with the following:
For S3 I/O in Python, see my other post. To analyze the data efficiently, I used the Python package `multiprocessing` to maximize CPU usage on an AWS compute instance. The result: multi-CPU processing on a c4.2xlarge was six times faster than ordinary preprocessing on my local computer.
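A minimal sketch of this pattern: fan per-scan preprocessing out to a pool of worker processes with `multiprocessing.Pool`. The `preprocess` function here is a hypothetical placeholder (a simple min-max normalization), standing in for the actual CT preprocessing, which is not shown in the post.

```python
from multiprocessing import Pool, cpu_count

def preprocess(pixels):
    # Placeholder transform (assumption): min-max normalize one scan's
    # pixel values to [0, 1]. Real CT preprocessing would go here.
    lo, hi = min(pixels), max(pixels)
    return [(v - lo) / (hi - lo) for v in pixels]

def preprocess_all(scans):
    # One worker per CPU core; pool.map distributes scans across workers.
    with Pool(processes=cpu_count()) as pool:
        return pool.map(preprocess, scans)

if __name__ == "__main__":
    # Toy stand-ins for decompressed CT scans.
    scans = [[0, 50, 100], [10, 20, 30]]
    print(preprocess_all(scans))
```

Because `pool.map` splits the list of scans across processes, throughput scales roughly with core count for CPU-bound transforms, which matches the ~6x speedup observed on the 8-vCPU c4.2xlarge.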