Whilst working on my previous post on two-factor authentication I was reminded of the broad spectrum of approaches that sites take to security. I have one bank that does two-factor authentication, another that uses the standard username/password combination for one-factor authentication, and another that asks for a username, a piece of personal information and then a partial password: in this case, three characters from the password.
Of these three approaches it is the last that seems the weakest. The piece of personal information includes things like the town you grew up in, the name of your first school, etc. These are not generally secret and could be discovered for most people with a bit of research. The partial password scheme seems only slightly stronger than a three-character password.
I believe the partial password is intended to prevent a keylogger on your computer from compromising your entire password. This scheme is often paired with selecting characters from a dropdown menu, potentially providing additional protection. The idea is that by requesting different characters on each visit you would need to log in multiple times on a compromised computer before an attacker discovered your entire password.
I don't think such a system would be used in a new application today but I did wonder how such a scheme might be implemented.
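As a rough illustration of the idea, here is a minimal sketch in Python. The function names `choose_positions` and `check_partial` are hypothetical, and the sketch assumes the server can read individual characters of the stored password. That assumption matters: a single one-way hash of the whole password cannot answer a partial challenge, so a real system would need some other storage scheme, which this sketch does not attempt to cover.

```python
import hmac
import secrets


def choose_positions(password_length, count=3):
    """Pick `count` distinct 1-based character positions to ask for."""
    if count > password_length:
        raise ValueError("cannot request more characters than the password has")
    rng = secrets.SystemRandom()  # OS-backed randomness rather than the default PRNG
    return sorted(rng.sample(range(1, password_length + 1), count))


def check_partial(stored_password, positions, supplied):
    """Return True if `supplied` matches the stored characters at `positions`.

    Assumption: `stored_password` is recoverable on the server. This is
    only an illustration, not a recommendation for storing passwords.
    """
    if len(supplied) != len(positions):
        return False
    expected = "".join(stored_password[p - 1] for p in positions)
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), supplied.encode())
```

A login flow would call `choose_positions` once per attempt, remember the chosen positions on the server side, and pass the user's response to `check_partial`.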
This is the first semester since the Fall of 2015 that I have not taught a course with the Foundation for Advanced Education in the Sciences. It was a pleasure teaching and I was lucky enough to spend most of my time on a course I had designed. For the Spring 2016 semester I designed the syllabus and began teaching a course on machine learning and object-oriented Python. I chose to include a web application as I felt it exposed the students to some unfamiliar ideas.
Most of the students were fellow scientists. Many only had previous experience writing scripts for use in their own research. Not trusting user input was often a novel concept. During the course I only had a couple of hours to introduce web applications. This meant I skipped over many important topics. I intend this post to be the first in a collection moving beyond the basics for anyone still new to these concepts. I will start with a basic background but the actual implementation will hopefully be new for most. If implementing web application authentication is familiar to you then skip ahead to the implementation.
Let me know if there are topics you think I should cover moving forward.
In this post I will cover authentication, specifically adding a second authentication factor for additional security.
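One widely used second factor is a time-based one-time password (TOTP, RFC 6238), the kind of six-digit code produced by authenticator apps. The excerpt here does not say which mechanism the post goes on to implement, so treat the following as a generic sketch rather than the post's method. The function name `totp` and its parameters are my own choices; it uses only the Python standard library.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at=None, step=30, digits=6):
    """Compute an RFC 6238 TOTP code using HMAC-SHA1.

    `secret` is the shared key as raw bytes; `at` is a Unix timestamp
    (defaults to the current time); `step` is the validity window in seconds.
    """
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks an offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

The server and the user's device share `secret` ahead of time; at login the server computes the code for the current time step and compares it with what the user typed, usually allowing for a little clock drift.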
I'm just back from the UK where I spent a couple of weeks catching up with family and friends. My visit happened to coincide with one of the monthly PyData London events so I attended and gave a lightning talk on image segmentation in medical applications.
They have built a really vibrant community and it was great meeting over 200 data enthusiasts.
At the NIH Pi Day celebration I gave a lightning talk on applying deep learning to histology images. A video of the event is now available at NIH Videocast.
During the one hour event there were presentations on ten different projects. I was the second speaker and began at 8:48.
I am using deep learning to identify glomeruli in kidney biopsies. When we are unsure about the specific type of kidney disease a patient has we take a small biopsy to look at the kidney. It is often differences in the glomeruli that define the type of disease. Pathologists study the biopsy to define the type of kidney disease. These skilled pathologists spend significant time locating the glomeruli. A machine can do this simple step. The pathologist can then focus on the harder disease identification task.
The theme for the Transportation Techies event this month was Capital Bikeshare. This is the bike sharing service in Washington DC. Information is available on every trip and every station. Lots of analyses are possible with all this data. This event was the seventh on this theme.
I had not worked with geographical or transportation data before this so I learned a lot. I treated the stations as cities in the traveling salesperson problem. I then calculated the shortest path visiting all the stations.
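To make the framing concrete, here is a small Python sketch that treats stations as points and builds a tour with the greedy nearest-neighbor heuristic. This is not the method used for the talk: it uses straight-line (haversine) distances rather than cycling routes, and nearest-neighbor gives a quick approximate tour, not the shortest one. The function names are hypothetical, and `stations` is assumed to be a dict mapping a station name to a `(latitude, longitude)` pair.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km: mean Earth radius


def nearest_neighbor_tour(stations, start):
    """Visit every station once, always moving to the closest unvisited one."""
    unvisited = set(stations) - {start}
    tour = [start]
    while unvisited:
        here = stations[tour[-1]]
        nxt = min(unvisited, key=lambda name: haversine_km(here, stations[name]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour
```

Swapping `haversine_km` for a lookup into a matrix of routed cycling distances would bring the sketch closer to the approach described here, and a dedicated solver would be needed to get a tour that is actually close to optimal.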
I was able to do this using open data and open source software. This included customizing the calculation of distances for cycling.
The slides I presented include links to all the data and software used. The code I wrote is available on GitHub. I include a Dockerfile for running the routing software with data for the Washington DC region.