On October 14 and 15, I attended the annual Rice University Data Science Conference. There were several things I took from the opening day's sessions.
The keynote speaker, Dr. David A. Jaffary, was from M.D. Anderson Cancer Center, a hospital that is world renowned for cancer research and treatment. A large portion of his talk was about data governance and who actually owns a patient’s medical data. Is it the patient, the hospital, the doctor, the insurance company, a combination of these, or none of these? He displayed a map of the United States that showed how this varies from state to state. What was clear is that there is no consistency. This question has to be answered before real analytic work can be done on patient data. It is an absolutely critical problem, since data science can improve patient care as well as the operational efficiency and profitability of medical institutions, to name just a few benefits. In most instances, only hospital staff have access to the data, yet it is sitting unused. I asked Dr. Jaffary why medical institutions are not making the kind of major investments in data science that companies such as Walmart, Wells Fargo, and Uber are making. He stated it is all about outcomes.
The second speaker, Kjersti Engan, Professor of Electrical Engineering and Computer Science, University of Stavanger, Norway, discussed image analysis of tissue samples. She mentioned that images must be labeled by trained pathologists so there is a ground truth to compare the deep learning model's predictions to. Unfortunately, this is a catch-22. Pathologists do not have enough time to label images, since labeling is time- and labor-intensive. Without labeled images, deep learning cannot be used to find areas of interest on the slides. If the images were labeled, the pathologists would have more time to work on the more complicated cases that deep learning cannot currently solve. In other words, there is no time to label images, yet labeled images would ultimately save time.
Another hot topic was explainability of deep learning models and how physicians are reluctant to adopt AI when no one can tell them why it made the predictions it did. This is where the trade-off between classical machine learning and deep learning comes in. Maybe the solution to this problem is to use classical machine learning techniques, such as random forests or support vector machines, for specific problems rather than neural networks. In contrast to neural networks, classical machine learning techniques generally produce models that are easier to explain. Maybe it is better to start an analysis with classical machine learning and only implement deep learning models when the results are significantly better.
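To make the explainability point a little more concrete, here is a minimal sketch of about the simplest classical model there is: a one-rule "decision stump," the building block that decision trees and random forests are assembled from. Nothing here comes from the conference talks. The function name `fit_stump`, the single feature, and the numbers are all invented for illustration. The point is only that the fitted model is a rule a person can read.

```python
def fit_stump(values, labels):
    """Fit a one-rule classifier: predict 1 when value > threshold.

    Tries the midpoint between each pair of neighboring feature values
    and keeps the threshold that classifies the most examples correctly.
    Returns (threshold, accuracy on the training data).
    """
    best_threshold, best_correct = None, -1
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        correct = sum((v > threshold) == bool(y) for v, y in zip(values, labels))
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold, best_correct / len(values)


# Invented example data: one numeric measurement per case and a 0/1 label.
measurements = [4, 6, 7, 9, 12, 15, 18, 21]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, accuracy = fit_stump(measurements, labels)
print(f"Rule: predict positive when measurement > {threshold}")
print(f"Training accuracy: {accuracy:.0%}")
```

A real random forest combines many such rules, so it is harder to read than a single stump, but the individual splits can still be inspected in a way that the weights of a neural network cannot.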
Finally, a speaker discussed the carbon footprint generated by servers that are executing deep learning jobs using GPUs. I had not considered this before, but it is a problem that must be addressed since the use of deep learning is becoming more prevalent.
On Day 2, I attended several sessions on data science in healthcare. One session, presented by Dr. Luca Giancardo, Assistant Professor at the Center for Precision Health, School of Biomedical Informatics (SBMI) at the University of Texas Health Science Center at Houston, stood out. It was about using quantitative measurements to determine the progression of Parkinson’s Disease in patients. Currently, the progression is measured by a neurologist looking at a patient’s movements. At a later time, the neurologist has the patient perform the same movements and makes a qualitative decision about the progression. This is highly subjective. One group is using two different methods to measure the progression. One involves hold time for keys on a keyboard. As the disease progresses, there is a marked and statistically significant difference in hold time on keys. They are also using metrics from swipes on mobile devices. As with hold time, swipe dynamics change over time. The pressure applied, the length of the swipe, and how straight it is all change. By using these measurements, physicians are able to get clear numerical values that show the progression.
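As a rough illustration of what "clear numerical values" could look like, here is a minimal sketch that computes a mean key hold time and two swipe measurements (path length and straightness). This is my own toy example, not the method from the talk. The function names, the data layout, the definition of straightness, and every number below are assumptions made up for illustration.

```python
from math import dist
from statistics import mean


def mean_hold_time(key_events):
    """Mean hold time in ms from (key, press_ms, release_ms) tuples."""
    return mean(release - press for _, press, release in key_events)


def swipe_metrics(points):
    """Return (path_length, straightness) for a swipe given as (x, y) points.

    Straightness is the start-to-end distance divided by the path length,
    so 1.0 is a perfectly straight swipe and lower values wander more.
    """
    path_length = sum(dist(a, b) for a, b in zip(points, points[1:]))
    if path_length == 0:
        raise ValueError("swipe needs at least two distinct points")
    return path_length, dist(points[0], points[-1]) / path_length


# Invented typing sessions: (key, press time in ms, release time in ms).
baseline_keys = [("a", 0, 90), ("s", 300, 400), ("d", 620, 700), ("f", 950, 1050)]
follow_up_keys = [("a", 0, 140), ("s", 400, 570), ("d", 850, 980), ("f", 1300, 1460)]

baseline_hold = mean_hold_time(baseline_keys)
follow_up_hold = mean_hold_time(follow_up_keys)
print(f"Mean hold time: {baseline_hold} ms at baseline, {follow_up_hold} ms at follow-up")

# Invented swipes: a straight one and one that bends partway through.
straight_swipe = [(0, 0), (3, 4)]
bent_swipe = [(0, 0), (3, 4), (6, 0)]
for name, swipe in [("straight", straight_swipe), ("bent", bent_swipe)]:
    length, straightness = swipe_metrics(swipe)
    print(f"{name} swipe: length {length:.1f}, straightness {straightness:.2f}")
```

Tracking numbers like these across visits gives a trend that can be compared from one visit to the next, which is the contrast with a purely visual assessment. Deciding whether a change is statistically significant would take far more data and a proper test, which this sketch leaves out.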
Day 2 was also when the posters were presented. Kate Weeks, a student in the University of the Pacific’s Master’s degree program in Data Science, presented her work with San Joaquin County's road maintenance department on pothole detection using accelerometer data. I have had the privilege of working with her on the project. Great job, Kate! Update: Kate became a Data Analyst at Educational Results Partnership in December 2019 after graduation. Congratulations!
If you have questions and want to connect, you can message me on LinkedIn or Twitter. Also, follow me on Twitter @pacejohn, LinkedIn https://www.linkedin.com/in/john-pace-phd-20b87070/, and follow my company, Mark III Systems, on Twitter @markiiisystems
#rice #datascience #ai #artificialintelligence #mdanderson #machinelearning #aiinhealthcare #healthcare #cancer #deeplearning #neuralnetwork #randomforest #gpu #supportvectormachine #svm #universityofthepacific #uop