This is the first semester since the fall of 2015 that I have not taught a course with the Foundation for Advanced Education in the Sciences. It was a pleasure to teach, and I was lucky enough to spend most of my time on a course I had designed. For the Spring 2016 semester I designed the syllabus and began teaching a course on machine learning and object-oriented Python. I chose to include a web application because I felt it exposed the students to some unfamiliar ideas.
Most of the students were fellow scientists. Many had previously only written scripts for use in their own research, so not trusting user input was often a novel concept. During the course I had only a couple of hours to introduce web applications, which meant I skipped over many important topics. I intend this post to be the first in a collection that moves beyond the basics for anyone still new to these concepts. I will start with some basic background, but the actual implementation will hopefully be new for most. If implementing web application authentication is familiar to you, skip ahead to the implementation.
Let me know if there are topics you think I should cover moving forward.
In this post I will cover authentication, specifically adding a second authentication factor for additional security.
At the NIH Pi Day celebration I gave a lightning talk on applying deep learning to histology images. A video of the event is now available at NIH Videocast.
During the one hour event there were presentations on ten different projects. I was the second speaker and began at 8:48.
I am using deep learning to identify glomeruli in kidney biopsies. When we are unsure about the specific type of kidney disease a patient has, we take a small biopsy to look at the kidney. It is often differences in the glomeruli that define the type of disease, and skilled pathologists spend significant time locating the glomeruli before they can study them. A machine can do this simple step, leaving the pathologist free to focus on the harder task of identifying the disease.
The theme for the Transportation Techies event this month was Capital Bikeshare. This is the bike sharing service in Washington DC. Information is available on every trip and every station. Lots of analyses are possible with all this data. This event was the seventh on this theme.
I had not worked with geographical or transportation data before, so I learned a lot. I treated the stations as cities in the traveling salesperson problem and then calculated the shortest path visiting all the stations.
I was able to do this using open data and open source software. This included customizing the calculation of distances for cycling.
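To give a feel for the idea, here is a minimal sketch of treating stations as cities and building a tour. It is not the code from my talk: it uses the simple nearest-neighbour heuristic, which gives a reasonable tour but not necessarily the shortest one, and it uses straight-line (haversine) distance as a stand-in for cycling distances from routing software. The function names and the station coordinates are made up for illustration.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_neighbour_tour(stations, distance=haversine_km, start=0):
    """Visit every station once, always moving to the closest unvisited one.

    This is a heuristic: the result is a valid tour, not a proven optimum.
    Pass a different `distance` function to plug in cycling distances.
    """
    unvisited = set(range(len(stations)))
    unvisited.remove(start)
    tour = [start]
    while unvisited:
        here = stations[tour[-1]]
        nxt = min(unvisited, key=lambda i: distance(here, stations[i]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def tour_length(stations, tour, distance=haversine_km):
    """Total length of the path that visits stations in tour order."""
    return sum(distance(stations[a], stations[b])
               for a, b in zip(tour, tour[1:]))


# Hypothetical station coordinates (lat, lon), roughly in Washington DC.
stations = [(38.90, -77.03), (38.90, -77.00), (38.90, -77.02), (38.90, -77.01)]
tour = nearest_neighbour_tour(stations)
print(tour, round(tour_length(stations, tour), 2), "km")
```

Keeping the distance function as a parameter means the straight-line estimate can be swapped for distances along actual cycling routes without touching the tour logic.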
The slides I presented include links to all the data and software used. The code I wrote is available on GitHub. I include a Dockerfile for running the routing software with data for the Washington DC region.
Deep neural networks are typically too slow to train on CPUs, so GPUs are used instead. The example in the notebook uses a relatively small network, so it should be runnable on any hardware.