On Monday and Tuesday, 4/15 and 4/16, I had the opportunity to participate in a hackathon at the Bio-IT World conference in Boston. The experience was amazing. The hackathon was put on by the NCBI hackathon team of Ben Busby (@DCGenomics), Kaitlyn Barago (@KaitlynMBarago), and Allissa Dillman (@DCHackathons). We have worked with this team at other hackathons that Mark III Systems has sponsored.

FAIR stands for:

F – Data is Findable
A – Data is Accessible
I – Data is Interoperable
R – Data is Reusable

My team’s topic was “BLAST, Pipelines, and FAIR”. If you are not familiar with BLAST, it is a free software package, developed by NCBI, that lets you search for DNA or protein sequences in pre-made or custom databases. BLAST was by far the software package I used most in my graduate work. I am a huge fan.

Our project was to create a reusable workflow that automates a bioinformatics pipeline so that anyone can run it in a standard environment, such as Linux. The only thing that has to change is the input files. For more detail on the actual pipeline and our final presentation, see our team’s GitHub page. The presentation is in the Slides folder.

Our pipeline was developed using CWL (Common Workflow Language). CWL is an open standard for describing command-line tools and the workflows that connect them. All configuration information, such as data file paths, is stored in YAML files. YAML is also widely used for configuration files outside of CWL. In the end, we had one very nice CWL file that ran the entire pipeline and three YAML configuration files. The whole workflow ran with a single command from the command line.

Overall, it was a great experience. I met some great people from varied backgrounds and enjoyed the camaraderie and cooperation among all of the teams. I can’t wait for next year’s hackathon!

Big thanks to all of my teammates!

Amanda Ruby, Software Engineer/Bioinformatics Analyst at Rheonix, Inc., @AmandaRubyBio
Tom Madden, Team Lead for BLAST at the NCBI, @tom6931
Alexander Jung, Head of Digitalization Biologicals Development CMC at Boehringer Ingelheim
Matt Doherty, Founder at Resolute.ai, @ResoluteAI
Jody Burks, Developer Advocate, Quantum Computing Ambassador, IBM, @JodyBurksPhD

If you have questions or want to connect, you can message me on LinkedIn or Twitter. You can also follow me on Twitter (@pacejohn) and LinkedIn (https://www.linkedin.com/in/john-pace-phd-20b87070/), and follow my company, Mark III Systems, on Twitter (@markiiisystems).

#hackathon #ncbi #blast #artificialintelligence #ai #machinelearning #cwl #yaml #bioitworld #bioinformatics #github