This post gives summaries of, and personal commentary on, some (but not all) of the talks I heard on Day 1 of the Deep Learning in Healthcare conference hosted by Rework on May 23-24, 2019, in Boston. The talks are not listed in the order in which they were delivered.
Some of the major overarching topics discussed were NLP in medical records, development of algorithms, and recurrent neural networks.
Panel Discussion: The Impacts of Machine Learning in Mental Health Care – Akane Sano, Rice University; Jordan Smoller, Psychiatrist, Harvard Medical School and Massachusetts General Hospital
This may have been the highlight of the day. I’ll just mention some of the major points Dr. Smoller brought up and discussed.
One question I asked was whether they were able to use socioeconomic data in conjunction with the medical record. He said they could in some cases, but that broader access to such data would be tremendously helpful.
Learning How the Genome Folds in 3D – Neva Duran, Aiden Lab, Baylor College of Medicine
I was particularly excited about this talk because the Aiden Lab is world-renowned in the field of biology; among other things, this lab invented the Hi-C sequencing method. Dr. Duran discussed how they are trying to determine the locations of enhancers and promoters along chromosomes. These can be difficult to find since they may be located great distances from each other yet work in concert. She described this as "spooky action at a distance," a term I like. When enhancers and promoters interact, they bind together and form a looped DNA structure. By using deep learning to create "contact maps" of regions along the chromosomes, they were able to locate potential enhancers and promoters. This is something I want to investigate further.
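The loop-finding intuition can be sketched in miniature. This is purely illustrative and not the Aiden Lab's actual method (they apply deep learning to real Hi-C data); the function name, thresholds, and toy matrix below are all invented:

```python
# Hypothetical sketch: flagging candidate enhancer-promoter loops in a
# Hi-C-style contact map. In a real contact map, nearby bins always show
# high contact frequency, so the interesting signal is a high-contact
# pair of bins that are FAR apart along the chromosome (a loop anchor).

def candidate_loops(contact_map, min_separation=3, threshold=0.8):
    """Return (i, j) bin pairs that are distant along the chromosome
    yet show unusually high contact frequency."""
    n = len(contact_map)
    hits = []
    for i in range(n):
        for j in range(i + min_separation, n):
            if contact_map[i][j] >= threshold:
                hits.append((i, j))
    return hits

# Toy 6x6 symmetric map: a strong diagonal (neighbors always touch)
# plus one long-range peak at bins (0, 5), mimicking a looped pair.
toy = [[1.0, 0.6, 0.3, 0.1, 0.1, 0.9],
       [0.6, 1.0, 0.6, 0.3, 0.1, 0.1],
       [0.3, 0.6, 1.0, 0.6, 0.3, 0.1],
       [0.1, 0.3, 0.6, 1.0, 0.6, 0.3],
       [0.1, 0.1, 0.3, 0.6, 1.0, 0.6],
       [0.9, 0.1, 0.1, 0.3, 0.6, 1.0]]

print(candidate_loops(toy))  # only the distant (0, 5) pair survives
```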
Applying AI in Early Clinical Development of New Drugs – James Cai, Head of Data Science, Roche Innovation Center New York
This was the first talk that made me think of new use cases that could be developed for our customers. Dr. Cai discussed a phone app they developed that can identify standing, sitting, walking, and other movements. It is being used to give quantitative measurements of the activity of Parkinson's Disease patients to see how their disease is progressing over time. He said this is critical because self-reporting of symptoms by patients is often inconsistent, vague, and lacking in detail. Next, he discussed how they are using smart watches to measure movement in schizophrenia patients. His hypothesis is that you can measure patient motivation and the level and duration of depressive episodes by analyzing how much time patients spend lying down or sleeping versus standing or moving. This can help physicians monitor the effectiveness of the patient's medications without having to rely solely on self-reported data.
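The core idea of inferring posture from a wearable can be sketched very simply. The actual Roche app is far more sophisticated; everything here (axis convention, threshold, function names) is an invented illustration of the principle that gravity aligns with the device's vertical axis when the wearer is upright and shifts toward a horizontal axis when they are lying down:

```python
# Illustrative only: classify posture from a single accelerometer
# reading (in units of g) by the tilt of the gravity vector, then
# summarize the fraction of time spent lying down.
import math

def posture(ax, ay, az):
    """'upright' if gravity is within 45 degrees of the device's
    z-axis, otherwise 'lying'. Threshold is made up."""
    tilt = math.degrees(math.acos(az / math.sqrt(ax**2 + ay**2 + az**2)))
    return "upright" if tilt < 45 else "lying"

def fraction_lying(readings):
    """Crude proxy for the duration measures described in the talk."""
    labels = [posture(*r) for r in readings]
    return labels.count("lying") / len(labels)

# Six mostly-vertical readings and two mostly-horizontal ones.
day = [(0.0, 0.1, 1.0)] * 6 + [(0.9, 0.1, 0.2)] * 2
print(fraction_lying(day))  # 0.25
```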
The “Why” Behind Barriers to Better Health Behaviors – Ella Fejer from the UK Science & Innovation Network
Dr. Fejer mentioned that our AI healthcare projects should "Be Inspirational." I found this statement challenging. So often we seem focused on demonstrating that something can be done. What if we focused not just on proving that it can be done, but on doing it in a way that is "inspirational" to clients and patients? What does "inspirational" mean? I think, in this context, it means getting people excited about what can be accomplished and how it can impact their lives for the better.
Application of a Deep Learning Model for Clinical Optimization and Population Health Management – Janos Perge, Principal Data Scientist, Clinical Products Team, CVS
Dr. Perge said that CVS owns Aetna, one of the largest insurance companies in the US, which gives them access to a wealth of medical data and records. I did not know CVS owned Aetna. His team has worked on two real-world applications. The first is a low back surgery model that predicts the risk of future back surgery over three time intervals. The second is a set of kidney failure models that predict the future risk of a member developing chronic kidney failure and needing dialysis over different time intervals. Both projects have the potential to greatly lower costs, since they can help physicians develop preventative treatment plans and thus avoid major medical issues down the road. They developed hybrid models consisting of LSTMs that consume sequential ICD-9 codes and a logistic layer at the top that incorporates static features from domain knowledge. One drawback of neural networks is the difficulty of explaining why or how a model arrives at its results; this opacity makes deep learning appear to be a "Black Box." To get around it, they added an attention layer near the top of the model that signals which events were relevant. Dr. Perge did not go into much detail on this, and I wish he had. They found that by using deep learning with transfer learning, including both the sequential features and all of the static features, they were able to achieve the highest AUC.
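To make that architecture concrete, here is a heavily simplified sketch of the wiring described: a sequence of diagnosis codes is pooled with attention weights (which show which events the model "attended" to), then a logistic layer mixes in a static feature. To stay self-contained, a fixed embedding table stands in for the LSTM, and all codes, embeddings, and weights are invented, not CVS's actual model:

```python
import math

# Toy 2-d embeddings keyed by ICD-9 code (invented values).
EMBED = {"724.2": [1.0, 0.2],   # low back pain
         "722.1": [0.8, 0.9],   # disc displacement
         "V70.0": [0.1, 0.1]}   # routine exam

def attention_pool(seq, query=(1.0, 1.0)):
    """Softmax-weighted average of event embeddings; the weights are
    the 'relevance' signal an attention layer exposes."""
    scores = [sum(q * e for q, e in zip(query, EMBED[c])) for c in seq]
    mx = max(scores)
    w = [math.exp(s - mx) for s in scores]
    total = sum(w)
    w = [x / total for x in w]
    pooled = [sum(wi * EMBED[c][d] for wi, c in zip(w, seq)) for d in range(2)]
    return pooled, w

def predict(seq, static, weights=(1.5, 1.5, 0.8), bias=-2.0):
    """Logistic layer over [pooled sequence features, static feature]."""
    pooled, attn = attention_pool(seq)
    z = bias + sum(w * x for w, x in zip(weights, pooled + [static]))
    return 1 / (1 + math.exp(-z)), attn

# A member with a routine exam followed by back-related diagnoses;
# the static feature might be an age flag from domain knowledge.
risk, attn = predict(["V70.0", "724.2", "722.1"], static=1.0)
print(round(risk, 3), [round(a, 2) for a in attn])
```

Note how the attention weights concentrate on the back-related codes rather than the routine exam, which is exactly the kind of relevance signal that makes the "Black Box" more interpretable.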
How Chick-fil-A uses AI to Spot Food Safety Trends in Social Media – Davis Addy, Sr. Principal IT Leader, Food Safety & Product Quality
This was the only talk I attended that was not in the Healthcare track. I was interested because I wanted to see a real-world application of social media analysis. Their goal is simple: use social media, such as Twitter and Yelp, to help spot potential food safety issues. Their entire pipeline runs on AWS. They purchase social media data from a third-party company and analyze it using Amazon's Comprehend sentiment analysis service, with a couple of intermediate Python scripts written in-house. The results appear on a dashboard that both the restaurant and the corporate office can see, and if a comment indicates a problem, the restaurant can respond as necessary. He discussed in detail the challenges of sentiment analysis and the steps they have taken to make it more accurate. Take the word "sick." One review might say "I ate a chicken sandwich and it made me sick." This should be scored as negative. However, a teenager may write "Chick-fil-A makes a sick chicken sandwich." Here the reviewer is using "sick" in a positive manner, and the comment should be scored as positive. This was just one example from a long list of challenges they have had to face. He said all of their code is being put on GitHub (github.com/chick-fil-a). I thought that was a nice touch.
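The Comprehend step can't be reproduced here, but the "sick" problem he described can be illustrated as a post-processing rule that overrides a naive sentiment call. The pattern, word list, and function name below are all invented for illustration; this is not Chick-fil-A's actual code:

```python
import re

NEGATIVE_WORDS = {"sick", "ill", "nauseous"}
# Hypothetical slang pattern: "sick" directly modifying a food noun.
SLANG_POSITIVE = re.compile(r"\bsick\s+(chicken|sandwich|fries|shake)\b")

def adjust_sentiment(text, base_sentiment):
    """Flip a NEGATIVE call to POSITIVE when the 'sick + food item'
    slang pattern appears and no other negative word does."""
    lowered = text.lower()
    if base_sentiment == "NEGATIVE" and SLANG_POSITIVE.search(lowered):
        others = NEGATIVE_WORDS - {"sick"}
        if not any(w in lowered for w in others):
            return "POSITIVE"
    return base_sentiment

print(adjust_sentiment("I ate a chicken sandwich and it made me sick.",
                       "NEGATIVE"))  # NEGATIVE -- "sick" follows the food
print(adjust_sentiment("Chick-fil-A makes a sick chicken sandwich.",
                       "NEGATIVE"))  # POSITIVE -- "sick" precedes the food
```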
Natural Language Processing for Healthcare – Amir Tahmasebi, Director of Machine Learning and AI, Codametrix
This was the first of many talks on natural language processing (NLP) in healthcare. Dr. Tahmasebi did a very good job of explaining some of the major challenges of applying NLP to health records: data size, data source and format, data structure, longitudinal data, the complexity of clinical text and language, language ambiguity, and experience-driven domain knowledge and practice. Each of these is a hurdle that must be overcome to perform effective NLP. When discussing their deep learning strategy, he mentioned two things I had never heard of before: ELMo (Embeddings from Language Models) and BERT (Bidirectional Encoder Representations from Transformers). I'm not an NLP expert by any means, so I'm going to spend some time looking into them. Thankfully, he cited the papers that initially described them.
Deep Learning for the Assessment of Knee Pain – Vijaya Kolachalama, Assistant Professor, Section of Computational Biomedicine, Department of Medicine, Boston University School of Medicine
Dr. Kolachalama's talk was particularly fascinating because he attempted to demonstrate how the location and severity of pain, which is subjective, can actually be pinpointed using deep learning. In his work, he looked at knee pain. His model used a convolutional Siamese network, which applies the same set of weights in tandem to two different input vectors to compute comparable output vectors, and outputs a heatmap showing the predicted area of pain. During the Q&A, Dr. Tahmasebi (see summary above) asked whether the network wasn't just finding areas of abnormality in the knee. Dr. Kolachalama said this was the case, but that regions of abnormality are typically the source of pain, so the model helped discover both the location of pain and any abnormalities. I asked if they had tried to apply the model to referred pain, such as when reported pain in the knee is actually caused by a pinched sciatic nerve in the hip; in that case, no abnormality would be found in the knee. He said this is something they are looking into. I think there is potential there, because finding no abnormality in the knee could direct the physician to look for other sources of the pain.
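The Siamese idea itself is simple enough to show in a few lines. This toy sketch is not Dr. Kolachalama's model: his branches are convolutional networks over images, while here a single shared linear map stands in for the encoder. The point is only that one set of weights encodes both inputs, so the outputs live in the same space and can be compared directly:

```python
import math

# One weight matrix, reused for every input -- the defining Siamese trait.
SHARED_W = [[0.5, -0.2, 0.1],
            [0.3,  0.8, -0.4]]

def encode(x):
    """Apply the shared weights (identical for both branches)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in SHARED_W]

def distance(x1, x2):
    """Euclidean distance between the two encodings."""
    a, b = encode(x1), encode(x2)
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Invented 3-d "scan features": two nearly identical healthy knees
# versus one with abnormal features.
left_knee = [1.0, 0.2, 0.0]
right_knee = [1.0, 0.25, 0.05]
damaged = [0.1, 0.9, 0.8]

print(distance(left_knee, right_knee) < distance(left_knee, damaged))  # True
```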
Distributed Tensorflow – Scaling Your Model Training – Neil Tenenholtz, Director of Machine Learning at the MGH and BWH Center for Clinical Data Science
This was a nice technical talk on how to employ multiple GPUs, both within a single server and across servers, for data-parallel and model-parallel training. He mentioned Horovod, which allows GPUs on multiple servers to be used together via MPI. I have recently spent some time studying Horovod, and this talk helped augment my knowledge.
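The essence of data-parallel training is easy to simulate. Each worker computes gradients on its own data shard, the gradients are averaged (Horovod does this with a ring-allreduce over MPI or NCCL), and every worker applies the identical update. This sketch fakes the workers with a plain loop and fits a one-parameter model; it is a conceptual illustration, not Horovod or TensorFlow code:

```python
def local_gradient(shard, w):
    """Gradient of mean squared error for the model y = w*x,
    computed on one worker's shard of the data."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(grads):
    """Stand-in for Horovod's allreduce: average across workers so
    every worker sees the same global gradient."""
    return sum(grads) / len(grads)

# Two workers, each holding half of the data for the true model y = 3x.
shards = [[(1.0, 3.0), (2.0, 6.0)],
          [(3.0, 9.0), (4.0, 12.0)]]

w = 0.0
for _ in range(50):
    grads = [local_gradient(s, w) for s in shards]  # parallel in reality
    w -= 0.05 * allreduce_mean(grads)               # identical update everywhere

print(round(w, 3))  # converges toward 3.0
```

Because every worker applies the same averaged gradient, the replicas never drift apart, which is why this scheme scales so cleanly across GPUs and servers.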