On August 19 and 20, I attended the OpenPOWER Summit in San Diego, CA. This was my first time attending the conference, so I was a little unsure of what to expect. My main purpose in attending was to host a speaking session with Linton Ward, Distinguished Engineer at IBM, and Chris Sullivan, Assistant Director of the Center for Genome Research and Biocomputing at Oregon State University, where we spoke about a group we formed earlier this year, in conjunction with Charlie Foretich from Tech Data, called the “OpenPOWER Solution Builder Community” (OSBC).
The OSBC is a group of researchers, university faculty, members of industry, value added resellers, systems integrators, and business partners who use and/or sell IBM POWER Systems, particularly the IBM POWER9 AC922. Our overarching goals are to foster collaboration, share experiences, and give feedback. As one of the speakers at the conference said, “All of us is better than one of us.”
Currently, there is not much out there about what people are doing with POWER Systems, how they are doing it, what challenges they face, or how they have overcome those challenges. We plan to change that. In the presentation, Linton provided an overview of the OSBC, Chris talked about it from the hardware perspective, and I discussed the AI side of things, particularly using IBM Watson Machine Learning Community Edition and IBM Visual Insights (previously PowerAI Vision).
The presentation was well-received, and many attendees of the conference were excited about the OSBC. This is something that is needed and will provide a significant benefit to the community. To learn more about the community you can visit the OSBC website. You do not have to be a member of the Open Power Foundation to read the forum, but you do have to be a member to post.
Chris Sullivan has put together demos that compare the performance of the IBM POWER9 AC922 against several different Intel-based servers. It is remarkable and worth a 10-minute look. Chris has also put together a couple of rich GitHub repositories with scripts and tools specific to the AC922.
At the conference, a major announcement was made: IBM is moving the OpenPOWER Foundation to The Linux Foundation and open sourcing the POWER Instruction Set Architecture (ISA). TechCrunch is just one of the many outlets that provide more details on the announcement. I think this will have a positive impact overall because it will allow chip makers, such as the FPGA manufacturer Xilinx, to innovate more quickly and create products with a broader impact.
Finally, Linton, Chris, and I were interviewed for a podcast about the OSBC by Luke Schantz (@lukeschantz) from IBM. It was a great experience, and I will post the podcast once it is published. You can also check out my Twitter feed (@pacejohn) to see other highlights from the conference.
If you have questions and want to connect, you can message me on LinkedIn or Twitter. Also, follow me on Twitter @pacejohn, LinkedIn https://www.linkedin.com/in/john-pace-phd-20b87070/, and follow my company, Mark III Systems, on Twitter @markiiisystems.
#artificialintelligence #ai #machinelearning #ml #ibm #powersystems #openpower #linux #openpowerfoundation #linuxfoundation #techdata #oregonstateuniversity #AC922 #watsonmachinelearning #watson #OSBC #fpga
Over the past few weeks, I have been doing some benchmark testing between the IBM POWER9 AC922 server and the Nvidia DGX-1 server using time series data. The AC922 is IBM's POWER9 processor-based server optimized for machine and deep learning; the DGX-1 is Nvidia's Intel processor-based server optimized for the same workloads. Both servers use the latest Nvidia V100 GPUs: the AC922 has 2x 16GB V100s, while the DGX-1 has 8x 32GB V100s. This post summarizes the general process used in the benchmarking as well as the results. I have intentionally kept the post somewhat conceptual to illustrate our methodology and key discoveries at a high level.
I chose time series data for the benchmark testing for two reasons. First, it is something our customers are asking about. Second, time series problems can be investigated both with classical machine learning algorithms, like ARIMA, and with deep learning using recurrent neural networks, particularly LSTMs. ARIMA runs solely on the CPU, whereas LSTM training takes place on a GPU, so this testing allowed comparisons of both CPU- and GPU-based processes on both servers. In addition, it allowed me to compare the relative quality of the predictions made by two very distinct techniques.
In this work, I used synthetic data generated with Simglucose v0.2.1 by Jinyu Xie, "a Type-1 Diabetes simulator implemented in Python for Reinforcement Learning purpose." Predicting blood glucose levels for patients with diabetes is of particular importance because such models can be used in insulin pumps to predict the correct amount of insulin needed to normalize blood glucose levels. The software generates synthetic blood glucose data for children, adolescents, and adults at designated time points; I used 1-hour intervals. The training data consisted of 2,160 time points (90 days), and the prediction data consisted of 336 time points (14 days). So, the models were trained on 90 days of hourly values and then tried to predict the hourly values for the next 14 days.
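As a rough sketch of that data layout, here is how the 90-day training window and 14-day prediction window relate; the `glucose` array below is a synthetic stand-in with a daily cycle, not the simglucose API.

```python
import numpy as np

# Stand-in for the simulator output: one blood glucose reading per hour.
# 90 days of training data plus 14 days to predict, at 1-hour intervals.
HOURS_PER_DAY = 24
TRAIN_DAYS, PREDICT_DAYS = 90, 14

n_total = (TRAIN_DAYS + PREDICT_DAYS) * HOURS_PER_DAY  # 2,496 points
rng = np.random.default_rng(0)
# Illustrative series: a daily "seasonal" cycle plus noise
glucose = (120
           + 30 * np.sin(2 * np.pi * np.arange(n_total) / HOURS_PER_DAY)
           + rng.normal(0, 5, n_total))

train_series = glucose[:TRAIN_DAYS * HOURS_PER_DAY]    # 2,160 points
predict_series = glucose[TRAIN_DAYS * HOURS_PER_DAY:]  # 336 points
```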
Two different algorithms were used to make the predictions. The first was a classical machine learning technique known as ARIMA (Autoregressive Integrated Moving Average), more specifically SARIMA (Seasonal Autoregressive Integrated Moving Average), a specialized version of ARIMA that takes into account the seasonality of the data (blood glucose levels have a very clear seasonal pattern). The second was an implementation of a recurrent neural network known as an LSTM (Long Short-Term Memory). To perform the comparisons, I used Jupyter notebooks running Python scripts; thanks to Jason Brownlee (@TeachTheMachine) of Machine Learning Mastery for his publicly available code, which I adapted for this project. I compared the SARIMA and LSTM predictions using Mean Squared Error (MSE) as the quantitative metric of how well each model predicted. I also measured run times. For SARIMA, I recorded only the total training/prediction run time, since there is no distinct prediction phase. For the LSTM, I recorded separate run times for the training and prediction phases, since they are distinct processes.
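The two techniques consume the data differently: SARIMA is fit directly on the raw series, while the LSTM needs the series reframed as supervised input/output windows. Here is a minimal sketch of that framing and of the MSE metric; the lag-window length is illustrative, not necessarily what the notebooks used.

```python
import numpy as np

def to_supervised(series, n_lags):
    """Frame a 1-D series for an LSTM: each row of X holds n_lags
    consecutive values, and y holds the value that follows them."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = np.asarray(series[n_lags:])
    # Keras LSTMs expect input shaped (samples, timesteps, features)
    return X.reshape(-1, n_lags, 1), y

def mse(actual, predicted):
    """Mean Squared Error, the metric used to compare SARIMA and the LSTM."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return float(np.mean((actual - predicted) ** 2))
```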
For training and predictions, I used synthetic data for adolescents that I termed Patients 001, 002, 003, 004, and 005. For each patient, 2,160 hourly training data points and 336 prediction data points were generated. I then performed pairwise comparisons across all patients. For example, I would train a model using the training data for Patient 001, then use that model to make predictions for Patients 001, 002, 003, 004, and 005. Repeating the process for every patient gave a total of 25 comparisons. Full pairwise comparisons were performed to evaluate whether either SARIMA or LSTM could create models that generalize to other patients.
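The pairwise protocol can be sketched as follows; `train_model` and `evaluate` are hypothetical placeholders standing in for the real SARIMA/LSTM fitting and scoring steps.

```python
# Sketch of the full 5 x 5 pairwise protocol.
patients = ["001", "002", "003", "004", "005"]

def train_model(train_patient):
    # Placeholder: would fit SARIMA or an LSTM on this patient's 2,160 points
    return {"trained_on": train_patient}

def evaluate(model, target_patient):
    # Placeholder: would return the MSE over the 336 prediction points
    return 0.0 if model["trained_on"] == target_patient else 1.0

results = {}
for train_p in patients:
    model = train_model(train_p)
    for target_p in patients:
        results[(train_p, target_p)] = evaluate(model, target_p)
# 5 training patients x 5 prediction targets = 25 comparisons
```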
Not surprisingly, the models could not be generalized to other patients. This result underscores the importance of training machine learning models on data relevant to what they will predict.
Below are examples of how the models performed on predictions. The blue lines are the actual values and the red lines are the predicted values. In the first two images, the model was trained on the data for Patient 002, and predictions were then made for Patient 002. In the last two images, the model was again trained on the data for Patient 002, but predictions were made for Patient 001. As you can see, those results were significantly less accurate than when values for Patient 002 were predicted.
Figure 1: Predictions for Blood Glucose Levels for Patient 002. The top image shows the predictions using SARIMA. The bottom image shows predictions with the LSTM. Since the predictions are made using the data from the same patient, the predictions should be quite accurate, as is shown in the graphs.
Figure 2: Predictions for Blood Glucose Levels for Patient 001. The top image shows the predictions using SARIMA. The bottom image shows predictions with the LSTM. Since the predictions are made using the data for a different patient (Patient 002), the predictions should be less accurate, as is shown in the graphs.
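Plots like the ones above (actual in blue, predicted in red, over the 336-hour prediction window) can be produced with a sketch like the following; the series here are placeholders, not the actual patient data.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

hours = np.arange(336)  # 14 days of hourly predictions
actual = 120 + 30 * np.sin(2 * np.pi * hours / 24)  # placeholder series
predicted = actual + np.random.default_rng(1).normal(0, 8, hours.size)

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(hours, actual, color="blue", label="Actual")
ax.plot(hours, predicted, color="red", label="Predicted")
ax.set_xlabel("Hour")
ax.set_ylabel("Blood glucose (mg/dL)")
ax.legend()
fig.savefig("predictions.png")
```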
The results I obtained were very interesting. Here is a high-level overview.
So, the burning question is, "Which server is better for time series analysis? The Nvidia DGX-1 or the IBM POWER9 AC922?" The answer is both. For SARIMA and LSTM training, the DGX-1 outperformed the AC922. For LSTM predictions, the AC922 outperformed the DGX-1. Which one should you choose? The answer depends on the use case. If you want to use SARIMA or if LSTM training time is your driving factor, the DGX-1 may be the better choice. If prediction speed is critical, the AC922 could be better.
Admittedly, there are a couple of caveats that must be considered. There were some version differences of pandas, NumPy, TensorFlow, and Keras due to the differing processor architectures. I tried multiple versions of each and the results were very similar. Also, the DGX-1 uses Docker containers while the AC922 uses conda environments. This could lead to some differences as well. Overall, I think these differences have very little effect on the overall benchmark outcomes, but it is something I plan to investigate further. Finally, the models were trained on a small data set, only 2,160 data points. I will be trying a much larger data set in the future as well as trying different combinations of hyperparameters to improve forecast accuracy.
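One way to pin down those version differences is to record the exact package versions in each environment before a run. This helper is an illustrative sketch, not part of the original notebooks; it reports missing packages rather than raising, since the two environments may not have identical installs.

```python
from importlib import metadata

# The libraries whose versions differed between the two servers
PACKAGES = ["pandas", "numpy", "tensorflow", "keras"]

def environment_versions(packages=PACKAGES):
    """Return {package: installed version} for the current environment,
    flagging anything that is not installed instead of raising."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions
```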
The Jupyter notebooks, scripts, and data files, along with all of the summary statistics are available on my GitHub page (https://github.com/pacejohn/Glucose-Level-Prediction).
#artificialintelligence #ai #machinelearning #ml #timeseries #lstm #arima #sarima #nvidia #dgx #ibm #powersystems #ac922 #docker #conda #neuralnetworks #RNN #recurrentneuralnetwork #diabetes #pandas #numpy #tensorflow #keras
Special thanks to Jason Brownlee and Jinyu Xie!
Brownlee, Jason. 11 Classical Time Series Forecasting Methods in Python (Cheat Sheet).
Brownlee, Jason. How to Develop LSTM Models for Time Series Forecasting.