John Pace

Data Scientist, husband, father of 3 great daughters, 5x Ironman triathlon finisher, just a normal guy who spent a lot of time in school.
Let’s explore data science, artificial intelligence, machine learning, and other topics together.

Is it really an egg roll?

10/24/2019


Yesterday, I taught a computer vision workshop.  In the section about image classification, I had the attendees train a deep learning model that could correctly classify images of the letters A, B, C, E, and H.  To build the model, they started with another previously trained model.  This is known as "transfer learning."  More specifically, transfer learning is a method where a model developed for a task is reused as the starting point for a model on a second task.  It is an optimization that allows rapid progress or improved performance when modeling the second task.
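The workshop used its own tooling, but here is a minimal sketch of the idea in Keras (the base model, image size, and training settings below are my own illustrative choices, not the workshop's):

```python
# Transfer learning sketch: reuse a pre-trained network's weights as the
# starting point for a new 5-letter classifier. Illustrative, not the
# exact workshop code.
import tensorflow as tf
from tensorflow.keras import layers, models

# Convolutional base pre-trained on another task; its weights transfer.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the transferred weights at first

model = models.Sequential([
    layers.Rescaling(1.0 / 127.5, offset=-1),  # scale pixels to [-1, 1]
    base,
    layers.GlobalAveragePooling2D(),
    # The classes (A, B, C, E, H) exist only in this new head, not in the
    # transferred weights.
    layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(letter_images, letter_labels, epochs=5)  # then train on letters
```

Because the base is frozen, training only has to fit the small new head, which is why transfer learning converges so quickly.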
[Image: Examples of some of the letters used to train the models]

One of the models that could be used as a starting point had been trained on images of Chinese food.  My colleague, Joe Jurak, Senior Storage Infrastructure Architect at Mark III Systems, asked whether his newly trained letter model could correctly classify images of Chinese food as well, even though no pictures of Chinese food were in his data set.  After all, he had started from a model trained on images of Chinese food, so it should still be able to identify Chinese food.  Logically, it makes sense, right?  I explained that the pre-trained Chinese food model he started with was simply the weights and biases produced during that initial training; it did not include any of the classes, the names of each type of food.  Starting from those weights and biases simply sped up and optimized the training of the letter classification model.
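A toy illustration of that point: if you inspect what a saved model actually contains, you find tensors of numbers with no class names anywhere (the tiny model below is a stand-in, not the real Chinese food model):

```python
# What actually transfers: numbers, not labels. The class names live in the
# training code's label mapping, not in the weights themselves.
from tensorflow.keras import layers, models

donor = models.Sequential([              # stand-in for the Chinese food model
    layers.Dense(8, activation="relu", input_shape=(4,)),
    layers.Dense(3, activation="softmax"),  # 3 food classes, names not stored
])
for w in donor.get_weights():
    print(type(w).__name__, w.shape)     # plain float arrays; no "egg roll" here
```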

To test what I had said, Joe did the following experiment (a rough code sketch follows the list):
  1. Downloaded 5 pictures of egg rolls and labeled them as egg rolls.  There were already 100 pictures of each of the letters in the data set.
  2. Retrained his model to now include 6 classes:  A, B, C, E, H, and egg roll.
  3. Performed inference using a picture of an egg roll.
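Joe used the workshop's tooling, but a rough Keras equivalent of his experiment might look like this (the directory layout, file names, and training settings are invented for illustration):

```python
# Sketch of Joe's experiment: retrain with a 6th "egg_roll" class (only 5
# images) and then run inference on an unseen egg roll photo.
import numpy as np
import tensorflow as tf

classes = ["A", "B", "C", "E", "H", "egg_roll"]  # alphabetical, matching Keras
# Assume images/<class_name>/ holds 100 images per letter and 5 egg rolls.
train = tf.keras.utils.image_dataset_from_directory(
    "images", image_size=(224, 224), batch_size=16)

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(len(classes), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train, epochs=5)

# Inference on a previously unseen egg roll picture.
img = tf.keras.utils.load_img("unseen_egg_roll.jpg", target_size=(224, 224))
x = tf.keras.utils.img_to_array(img)[None, ...]
probs = model.predict(x)[0]
print(classes[int(np.argmax(probs))], float(probs.max()))
```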

What Joe found was that a previously unseen image of an egg roll was correctly classified as an egg roll with high confidence.  Since there were so few images of egg rolls in comparison with the letters, didn't that mean some knowledge of the shape of an egg roll was being conveyed from the pre-trained model?

I had Joe conduct a follow-up experiment to test his hypothesis.  He found a picture of a hot dog, which looks similar to an egg roll, and performed inference on it using his newly trained model.  It was classified as an egg roll with high confidence.  It made sense that it would be classified as an egg roll, given the similar shape, but the confidence of the prediction was far higher than it should have been.  Was the hot dog classified as an egg roll because the egg roll was the only thing in the data set that looked similar to a hot dog?  He then tried an image of a dog, something that does not look similar to an egg roll.  To his surprise, the model classified the dog as an egg roll with high confidence.  Looking at the heat map that showed which part of the image the model used to make its decision, he discovered that the dog's tail was driving the classification.  Clearly the dog should not have been classified as an egg roll; in fact, it should not have been classified as anything.  He tried another picture of a dog, one without a long tail, and it was not classified as anything.  The conclusion: the hot dog and the long-tailed dog were classified as egg rolls because they looked more like an egg roll than like a letter, not because the model was using prior knowledge about the characteristics of egg rolls.
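Joe's tool generated the heat map for him; one common way to produce such a map is Grad-CAM.  Here is a minimal sketch (it assumes a model whose last convolutional layer you can look up by name; his tool may have used a different method):

```python
# Grad-CAM sketch: highlight which image regions most increased the score of
# the predicted class.
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name):
    """Return an (h, w) heat map in [0, 1] for the model's top class."""
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        top = int(tf.argmax(preds[0]))        # index of the winning class
        score = preds[:, top]
    grads = tape.gradient(score, conv_out)    # d(score) / d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))   # pool gradients per channel
    cam = tf.einsum("bhwc,bc->bhw", conv_out, weights)[0]
    cam = tf.nn.relu(cam)                     # keep only positive evidence
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()

# heat = grad_cam(model, x[0], "Conv_1")  # layer name depends on the model
```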

This simple experiment underscores a very important principle in data science.  A model can predict something with very high confidence and still be completely wrong.  High confidence during inference does not always mean that the model is good.  Conversely, low confidence does not always mean the model is bad; we may just need more training data or different hyperparameters.  We must always go back and do sanity checks on our results instead of taking it for granted that our model is good or bad.
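One reason for this behavior is worth spelling out: a softmax classifier must spread 100% of its probability across the classes it knows, so an image that matches nothing still gets forced into the least-bad class, often with high "confidence."  A tiny numeric example (the logits are invented):

```python
# Softmax always produces a confident-looking answer for a closed set of
# classes, even for an input that belongs to none of them.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

# Hypothetical raw scores for a dog photo: nothing fits well, but the
# "egg roll" class fits least badly.
logits = np.array([-2.0, -1.5, -2.2, -1.8, -2.1, 3.0])  # A B C E H egg_roll
print(softmax(logits).round(3))  # egg roll ends up near 0.96 "confidence"
```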

If you have questions and want to connect, you can message me on LinkedIn or Twitter. Also, follow me on Twitter @pacejohn and LinkedIn https://www.linkedin.com/in/john-pace-phd-20b87070/

#artificialintelligence #ai #machinelearning #ml #convolutionalneuralnetwork #cnn #computervision #imageclassification #deeplearning #transferlearning #datascience

Thanks to Jason Brownlee for the nice definitions of transfer learning. (https://machinelearningmastery.com/transfer-learning-for-deep-learning/)

Rice Data Science Conference 2019

10/15/2019


On October 14 and 15, I attended the annual Rice University Data Science Conference.  There were several things I took away from the opening day's sessions.


The keynote speaker, Dr. David A. Jaffray, was from M.D. Anderson Cancer Center, a hospital that is world-renowned for cancer research and treatment.  A large portion of his talk was about data governance and who actually owns a patient's medical data.  Is it the patient, the hospital, the doctor, the insurance company, a combination of these, or none of these?  He displayed a map of the United States showing how the answer varies from state to state.  What was clear is that there is no consistency.  This question has to be answered before real analytic work can be done on patient data, and it is an absolutely critical problem since data science can improve patient care as well as the operational efficiency and profitability of medical institutions, to name just a few benefits.  In most instances, only hospital staff have access to the data, yet it sits unused.  I asked Dr. Jaffray why medical institutions are not making major investments in data science the way companies like Walmart, Wells Fargo, and Uber are.  He stated that it is all about outcomes.

[Image: Dr. David A. Jaffray, Chief Technology and Digital Officer, Professor, Departments of Radiation Physics and Imaging Physics, MD Anderson Cancer Center]
[Image: Dr. Kjersti Engan, Professor of Electrical Engineering and Computer Science, University of Stavanger, Norway]

The second speaker, Dr. Kjersti Engan, Professor of Electrical Engineering and Computer Science at the University of Stavanger, Norway, discussed image analysis of tissue samples.  She mentioned that images must be labeled by trained pathologists so there is a ground truth to compare the deep learning model's predictions against.  Unfortunately, this creates a catch-22.  Pathologists do not have enough time to label images, since labeling is time- and labor-intensive.  Without labeled images, deep learning cannot be used to find areas of interest on the slides.  Yet if the images were labeled, the pathologists would ultimately have more time to work on the more complicated cases that deep learning cannot currently solve.  That's where the catch-22 comes in: there is no time to label images, yet labeled images would ultimately save time.

Another hot topic was the explainability of deep learning models and how physicians are reluctant to adopt AI when no one can tell them why it made the predictions it did.  This is where the trade-off between classical machine learning and deep learning comes in.  Maybe the solution is to use classical machine learning techniques, such as random forests or support vector machines, for certain problems rather than neural networks.  In contrast to neural networks, classical machine learning techniques produce models that are explainable.  Maybe it is better to start an analysis with classical machine learning rather than deep learning and only implement deep learning models when the results are significantly better.
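As a sketch of that trade-off, here is a classical model explaining itself (this uses scikit-learn's built-in breast cancer dataset as a stand-in for real clinical data):

```python
# A random forest can report which features drove its predictions,
# a built-in form of explanation that a deep network does not provide.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()                       # standard demo dataset
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(data.data, data.target)

# Rank features by how much they reduce impurity across the forest.
ranked = sorted(zip(data.feature_names, clf.feature_importances_),
                key=lambda t: t[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```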

Finally, a speaker discussed the carbon footprint of servers executing deep learning jobs on GPUs.  I had not considered this before, but it is a problem that must be addressed as the use of deep learning becomes more prevalent.
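A back-of-the-envelope sketch of the footprint question (every number below is an assumption for illustration, not a measurement):

```python
# Rough GPU training footprint: energy = power x time; emissions = energy x
# grid carbon intensity. All values are illustrative assumptions.
gpu_power_watts = 300           # assumed draw of one training GPU
hours = 48                      # assumed length of the training run
kg_co2_per_kwh = 0.4            # assumed grid carbon intensity (varies widely)

energy_kwh = gpu_power_watts * hours / 1000
print(f"{energy_kwh:.1f} kWh, about {energy_kwh * kg_co2_per_kwh:.1f} kg CO2")
```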
[Image: Dr. Luca Giancardo]

On Day 2, I attended several sessions on data science in healthcare.  One session, presented by Dr. Luca Giancardo, Assistant Professor at the Center for Precision Health, School of Biomedical Informatics (SBMI) at the University of Texas Health Science Center at Houston, stood out.  It was about using quantitative measurements to track the progression of Parkinson's Disease in patients.  Currently, progression is assessed by a neurologist watching a patient's movements; at a later time, the neurologist has the patient perform the same movements and makes a qualitative judgment about the progression.  This is highly subjective.  One group is using two different methods to measure the progression quantitatively.  The first involves hold time for keys on a keyboard: as the disease progresses, there is a marked and statistically significant difference in how long keys are held down.  They are also using metrics from swipes on mobile devices.  As with hold time, swipe dynamics change over time: the pressure applied, the length of the swipe, how straight it is.  By using these measurements, physicians can get clear numerical values that show the progression.
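As a toy illustration of the key-hold-time idea (the timestamps and the size of the change are invented, not clinical data):

```python
# Hold time = key-up timestamp minus key-down timestamp. Aggregated over a
# typing session, it gives a simple numeric marker to track over time.
import statistics

# Hypothetical (key_down_ms, key_up_ms) pairs from two sessions months apart.
session_early = [(0, 95), (210, 300), (450, 540), (700, 790)]
session_late  = [(0, 150), (260, 420), (560, 720), (900, 1050)]

def mean_hold_ms(events):
    return statistics.mean(up - down for down, up in events)

print(mean_hold_ms(session_early))  # ~91 ms
print(mean_hold_ms(session_late))   # ~155 ms: holds lengthen over time
```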

[Image: Kate Weeks, Master's Degree Student, University of the Pacific (Data Analyst, Educational Results Partnership since December 2019)]

Day 2 was also when the posters were presented.  Kate Weeks, a student in the University of the Pacific's Master's degree program in Data Science, presented the work she has done with San Joaquin County's road maintenance department on pothole detection using accelerometer data.  I have had the privilege of working with her on the project.  Great job, Kate!  Update: Kate became a Data Analyst at Educational Results Partnership in December 2019 after graduation.  Congratulations!


If you have questions and want to connect, you can message me on LinkedIn or Twitter. Also, follow me on Twitter @pacejohn and LinkedIn https://www.linkedin.com/in/john-pace-phd-20b87070/

#rice #datascience #ai #artificialintelligence #mdanderson #machinelearning #aiinhealthcare #healthcare #cancer #deeplearning #neuralnetwork #randomforest #gpu #supportvectormachine #svm #universityofthepacific #uop