I had the pleasure of attending the 2019 AI Summit in San Francisco on September 25 and 26. At the conference, I was able to demo new AI software I am developing: an American Sign Language (ASL) translator. The goal is to produce an app that lets someone point their phone camera at a person who is signing in American Sign Language and have the signs transcribed onto the screen. Currently, the software can recognize and transcribe the ASL alphabet.

To train the deep learning model, I need many, many images of people performing the ASL alphabet. As I explained the work and showed people the demo, I asked if they would let me take a video of their hand as they signed the letters. A total of 37 people let me video them! I was tremendously grateful.

The image labeling is being done with IBM Visual Insights (formerly PowerAI Vision), and the models are being trained using the POWER Systems optimized version of TensorFlow included in the Watson Machine Learning Community Edition software, running on an IBM POWER9 AC922 server. I also had the opportunity to give an oral presentation four times in the booth to describe my project. Turnout for the presentations was good, and people asked a lot of probing questions.
The ASL software project is challenging in that it combines several machine learning/deep learning techniques that must work together both in concert and independently. The project uses convolutional neural networks (CNNs) for object recognition, a subset of computer vision, as well as recurrent neural networks (LSTMs) and natural language processing (NLP). At a high level, the software must recognize hands, arms, and facial expressions, since all are involved in ASL. Upon recognition, it must determine what the hand and arm are doing. Are they making a letter, a number, or some other symbol? When is the hand actually making a letter or signing a word, and when is it transitioning? The transitions need to be discarded. Combining all of these into a workflow that can run on a mobile app is something that has not been done before, but it will make a huge impact for the deaf community.
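To make that architecture a little more concrete, here is a minimal Keras/TensorFlow sketch of the general pattern: a CNN extracts features from each video frame, and an LSTM reads those features over time so the model can tell a held letter apart from a transition between letters. This is purely illustrative, not the project's actual model; the frame size, clip length, and the extra "transition" class are my own assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 25      # assumption: 24 static letters + 1 "transition" class
FRAMES_PER_CLIP = 16  # assumption: classify short clips of 16 frames

# CNN: per-frame feature extractor (the object-recognition piece).
cnn = models.Sequential([
    layers.Input(shape=(64, 64, 1)),          # 64x64 grayscale frame (assumed)
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
])

# LSTM: consumes the per-frame features across the clip, letting the
# model distinguish a signed letter from inter-letter transition motion,
# which can then be discarded downstream.
model = models.Sequential([
    layers.Input(shape=(FRAMES_PER_CLIP, 64, 64, 1)),
    layers.TimeDistributed(cnn),              # run the CNN on every frame
    layers.LSTM(64),                          # summarize the frame sequence
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

In a pipeline like this, any clip the model labels as the transition class is simply dropped before the remaining letter predictions are assembled into text.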
There were several things I found interesting at the conference. Typically, at an AI conference, one technology seems to be everywhere, such as autonomous vehicles or inferencing platforms. This conference was different: no single technology was overrepresented. Image labeling services were probably the most common offering, but even they did not dominate the show floor.
Overall, I thought the conference was well organized, had a nice breadth of vendors and technologies represented, and was quite productive. I will also be attending the AI Summit in New York in December where we will have an even bigger presence in the booth. Stop by and see the newest version of the ASL demo!
If you have questions or want to connect, message me on LinkedIn or Twitter: follow me on Twitter @pacejohn and on LinkedIn at https://www.linkedin.com/in/john-pace-phd-20b87070/, and follow my company, Mark III Systems, on Twitter @markiiisystems.
#artificialintelligence #ai #machinelearning #ml #lstm #ibm #powersystems #ac922 #neuralnetworks #RNN #recurrentneuralnetwork #tensorflow #watson #visualinsights #poweraivision #computervision #aisummit #convolutionalneuralnetwork #cnn #asl #americansignlanguage #deeplearning #objectrecognition #naturallanguageprocessing #nlp