Thanks very much for reporting this issue, and for making the pull request. I'm really glad you're engaged with Arvados at such a deep technical level. I wanted to take this opportunity to elaborate on some of the design background that motivated the current code, and what improvements might be most helpful from here.
Arvados expects Crunch scripts to exist in a crunch_scripts/ subdirectory inside the Git repository. The primary motivation for this is to make it easy to add Crunch scripts to existing repositories. For example, this is how the Crunch scripts shipped with Arvados itself work: the run-command script lives in crunch_scripts/run-command in our own repository. The user guide covers this to some extent: the tutorial on writing your own Crunch script instructs you to create a crunch_scripts directory in your repository and write the script there.
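To make the layout concrete, here is a minimal sketch of the lookup convention described above. It is not Arvados's actual implementation; the function names and the repo_root parameter are hypothetical, and only the crunch_scripts directory name comes from the convention itself.

```python
import os

# Directory name from the convention described above.
CRUNCH_SCRIPTS_DIR = "crunch_scripts"


def crunch_script_path(repo_root, script_name):
    """Return the path where a script named script_name is expected to live.

    Hypothetical helper: repo_root is a checked-out Git working tree.
    """
    return os.path.join(repo_root, CRUNCH_SCRIPTS_DIR, script_name)


def find_crunch_script(repo_root, script_name):
    """Return the expected script path if a file exists there, else None."""
    path = crunch_script_path(repo_root, script_name)
    return path if os.path.isfile(path) else None
```

Under this sketch, a script placed at the top level of the repository rather than under crunch_scripts/ would not be found, which is the situation the report describes.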
Because so many existing Git repositories already follow this layout, we would break a lot of pipelines if we simply stopped searching for Crunch scripts there. I understand that the documentation could be clearer about this layout, and we already have a bug filed about that, #6027. I've also just filed #7621, as a feature enhancement that could help users debug this issue more easily. You're absolutely right that there's plenty of room for improvement here. We're very open to other suggestions for ways we might make this whole process more user-friendly. However, in order to preserve backward compatibility, those changes will need to be a little more nuanced than simply no longer looking under crunch_scripts/.
I hope this helps explain why we couldn't accept the pull request as written. If anything's unclear, or if you have any follow-up questions, please don't hesitate to ask. Thanks again for the contributions.