Sprint Review: I love it when a plan comes together
Arvados component development is converging on an integrated platform.
Greetings from Curoverse HQ! We bring you tidings from the trenches of Arvados development, where we've just finished a really productive engineering sprint and are excited to see the product that's emerging.
For the last few months, each of our engineers has been working independently on different Arvados components. It's been an extremely productive period, but until now we haven't combined the different tools we've worked so hard to build. Now that we're putting those pieces together, it's incredibly satisfying and exciting to watch a unified, seamless application emerge.
Here are some of the features that came out of our May 7 engineering sprint, "Storing and Organizing Data":
We knuckled down and revamped the whole Workbench interface from end to end, resulting in one beautiful user experience with a sleek, clean, intuitive UI for managing your data and pipelines.
We know some of you have been really eager for a Java SDK, so we're delighted to announce the Java SDK for Arvados. Informaticians who work with Java can immediately write Arvados pipelines in that language.
A new event manager allows Arvados components to signal events to each other quickly and securely, reducing latency between back-end systems. This event bus allows us to make the system more responsive to critical system events like low disk space or compute node failures.
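For readers who like to see the idea in code, here is a minimal publish/subscribe sketch in Python. It is only an illustration of the general event-bus pattern, not the Arvados event manager itself: the `EventBus` class, the event names `node.failure` and `disk.low_space`, and the payload fields are all hypothetical, and the sketch runs in a single process with no networking or authentication.

```python
from collections import defaultdict


class EventBus:
    """Toy in-process event bus (hypothetical, not the Arvados API)."""

    def __init__(self):
        # Maps an event type to the list of handlers subscribed to it.
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        """Register a callable to run whenever event_type is published."""
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        """Call every handler for event_type; return how many were called."""
        delivered = 0
        for handler in list(self._handlers.get(event_type, [])):
            handler(payload)
            delivered += 1
        return delivered


bus = EventBus()
alerts = []

# Hypothetical event names and payload fields, chosen for illustration.
bus.subscribe("node.failure", lambda p: alerts.append(("node.failure", p["node"])))
bus.subscribe("disk.low_space", lambda p: alerts.append(("disk.low_space", p["free_gb"])))

delivered_failure = bus.publish("node.failure", {"node": "compute3"})
delivered_disk = bus.publish("disk.low_space", {"free_gb": 2})
delivered_unknown = bus.publish("job.finished", {"job": "example"})
```

A component that cares about node failures subscribes once and is called as soon as the event is published, instead of polling for changes, which is where the responsiveness gain comes from.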
It's now possible to build and run a Crunch job in a Docker container, drastically improving pipeline provisioning and reproducibility. The benefits of deploying a job in a Docker container include:
- simpler provisioning: Build your Docker image once, then deploy it consistently across all your compute nodes. Build as many images as you need to suit different analyses.
- reproducibility: Users now have a way to control the whole environment their analysis runs in, all the way down through the system’s C library.
We've also implemented a Data Manager tool to help site administrators keep tabs on the health of a local Arvados installation. The Data Manager will allow administrators to identify "garbage" data (blocks and collections that are no longer in use by any pipeline), control cache utilization, and monitor usage by each user.
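The "garbage" idea can be sketched with plain set arithmetic. The Python below is a rough illustration under stated assumptions, not how the Data Manager is implemented: it assumes pipelines reference collections and collections reference blocks, and the function `find_garbage` and all identifiers such as `blk-a` and `coll-1` are made up for the example.

```python
def find_garbage(stored_blocks, collections, pipelines):
    """Return (garbage_collections, garbage_blocks).

    stored_blocks: iterable of block IDs held in storage
    collections:   dict mapping collection ID -> list of block IDs
    pipelines:     dict mapping pipeline ID -> list of collection IDs
    All names and structures here are hypothetical.
    """
    # Collections that at least one pipeline still uses.
    live_collections = set()
    for collection_ids in pipelines.values():
        live_collections.update(collection_ids)

    # Blocks reachable from a live collection.
    live_blocks = set()
    for collection_id in live_collections:
        live_blocks.update(collections.get(collection_id, []))

    garbage_collections = set(collections) - live_collections
    garbage_blocks = set(stored_blocks) - live_blocks
    return garbage_collections, garbage_blocks


stored_blocks = {"blk-a", "blk-b", "blk-c", "blk-d"}
collections = {
    "coll-1": ["blk-a", "blk-b"],
    "coll-2": ["blk-b", "blk-c"],
}
pipelines = {"pipeline-1": ["coll-1"]}

garbage_collections, garbage_blocks = find_garbage(stored_blocks, collections, pipelines)
print(sorted(garbage_collections))  # ['coll-2']
print(sorted(garbage_blocks))       # ['blk-c', 'blk-d']
```

Note that `blk-b` is kept even though the unused `coll-2` refers to it, because the live `coll-1` refers to it too. A block only counts as garbage once nothing in use points at it.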
As always, if you're an Arvados user or curious about what we're doing, we'd love to hear from you. The engineering team coordinates on IRC, so if you're an IRC user, pop into #arvados on the OFTC network and say hi!