Bug #6862

[Documentation] What do you need to know about using Git with Arvados

Added by Bryan Cosca about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Documentation
Target version:
Start date: 08/05/2015
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: 1.0

Description

Provide a rough guide on how git works and how to work with your own repository.

  • Create new arvados Git repository.
  • Create a new branch (provide naming conventions)
  • Create a new file in a new branch
  • Git add
  • Git commit
  • Git push origin branch_name (Don't need to do origin all the time, explain remote vs local)
  • When making changes to original scripts, only commit and push.
  • Change script_version and script in pipeline template.
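
As a rough sketch of the git steps listed above (not the final documentation text), the following uses a local bare repository as a stand-in for the hosted Arvados repository. The names tutorial, tutorial_branch, and tutorial.txt, as well as the user name and email, are assumptions chosen for illustration.

```shell
set -e

# Stand-in for the remote Arvados repository (assumption: a local bare repo).
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"

# Clone it. The clone is the local copy; "origin" names the remote copy.
# Git may warn that the cloned repository is empty; that is harmless.
git clone --quiet "$work/remote.git" "$work/tutorial"
cd "$work/tutorial"
git config user.name "Example User"
git config user.email "user@example.com"

# Create a new branch and a new file on it.
git checkout --quiet -b tutorial_branch
echo "hello" > tutorial.txt

# Stage, commit locally, then publish the branch to the remote.
git add tutorial.txt
git commit --quiet -m "Add tutorial.txt"
git push --quiet -u origin tutorial_branch

# Later changes: stage, commit, and push again.
echo "more" >> tutorial.txt
git add tutorial.txt
git commit --quiet -m "Describe the file"
git push --quiet
```

The first push names the remote (origin) and, with -u, records it as the branch's upstream. After that a plain git push is enough, which is the "don't need to do origin all the time" point. Commits exist only in the local clone until they are pushed.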

Subtasks

Task #6889: Review 6862-git-doc-guide-new (Resolved, Radhika Chippada)

Associated revisions

Revision 8dccf76e
Added by Bryan Cosca about 4 years ago

closes #6862
Merge branch '6862-git-doc-guide-new'

History

#1 Updated by Brett Smith about 4 years ago

  • Category set to Documentation
  • Target version set to 2015-08-19 sprint

Bryan Cosca wrote:

  • Create new arvados Git repository.

This page already exists as "Adding a new arvados repository," under "Working with Arvados Repositories." The rest of this can be added as one or more pages to that section.

However, if we're going to talk about integration with pipeline templates, it might make sense to have a page titled something like "Working with Arvados repositories" under "Develop a new pipeline." The current Adding page could be a start of that. It might take a little tweaking to make sure the flow of the existing tutorial isn't broken from this change.

  • Create a new branch (provide naming conventions)

It might make sense to have Tom weigh in on this. Now that Arvados lets users create their own repositories, it might make sense to let people come up with their own branching plans, and just expect that they'll start out working on master directly, and create separate repositories for separate scripts.

  • Create a new file in a new branch
  • Git add
  • Git commit
  • Git push origin branch_name (Don't need to do origin all the time, explain remote vs local)
  • When making changes to original scripts, only commit and push.

?? You have to add all changes before you commit them, even if you're editing files that already exist in the repository. Are you maybe using git commit -a? That's a shortcut for running git add on all the changes to tracked files before committing them.
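
To illustrate the difference, here is a minimal sketch in a throwaway repository (the file names script.sh and notes.txt are made up for the example). It shows that git commit -a picks up edits to files git already tracks, while a brand-new file still needs an explicit git add.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

# First commit: a new file must be added explicitly.
echo "v1" > script.sh
git add script.sh
git commit --quiet -m "Add script.sh"

# Edit the tracked file and create a second, untracked file.
echo "v2" > script.sh
echo "new" > notes.txt

# -a stages every modified tracked file, then commits.
git commit --quiet -a -m "Update script.sh"

# The untracked file was not included in the commit.
status=$(git status --porcelain)
echo "$status"
```

The only line left in the status output is the untracked notes.txt, so "only commit and push" works for existing scripts when -a is used, but not for newly created files.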

#2 Updated by Bryan Cosca about 4 years ago

However, if we're going to talk about integration with pipeline templates, it might make sense to have a page titled something like "Working with Arvados repositories" under "Develop a new pipeline." The current Adding page could be a start of that. It might take a little tweaking to make sure the flow of the existing tutorial isn't broken from this change.

I agree with this. However, while going through the "Develop a new pipeline" documentation, I noticed that all of these commands already exist here: http://doc.arvados.org/user/tutorials/tutorial-submit-job.html. Maybe they need to be easier to find? What if people are looking for the answer to a one-off question rather than reading sequentially through the entire "Develop a new pipeline" page? Maybe something like #6866.

?? You have to add all changes before you commit them, even if you're editing files that already exist in the repository. Are you maybe using git commit -a? That's a shortcut for running git add on all the changes to tracked files before committing them.

Yes, I've been using git commit -a. I did not realize that is what -a did. I guess I'm still learning what git commands are actually doing rather than "I know this works so I'm going to keep doing this."

#3 Updated by Bryan Cosca about 4 years ago

  • Assigned To set to Bryan Cosca
  • Story points set to 1.0

Since we already have a "Working with Arvados repositories" section, I will add this as another page after "Adding a new arvados repository". I think the title should be something like "Guide to using Git with Arvados".

#4 Updated by Bryan Cosca about 4 years ago

  • Assigned To deleted (Bryan Cosca)

#5 Updated by Bryan Cosca about 4 years ago

  • Assigned To set to Radhika Chippada

#6 Updated by Bryan Cosca about 4 years ago

  • Assigned To changed from Radhika Chippada to Bryan Cosca

#7 Updated by Bryan Cosca about 4 years ago

  • Status changed from New to In Progress

#8 Updated by Bryan Cosca about 4 years ago

  • Description updated

Branch 6862-git-doc-guide is up for review.

I tried to keep it very simple. This is for bioinformaticians who have never heard of git before and want to try using it. I want it to be like a git reference guide without overwhelming the new user.

If there is anything else you think I should add, feel free to comment :)

#9 Updated by Radhika Chippada about 4 years ago

Looking at branch 6862-git-doc-guide, at commit ba85c6f5e9454e7fe3c40dc744345cb399d45f0c

Bryan: It appears that you accidentally included your work for some other issue here (probably for #6600)

git diff master...6862-git-doc-guide is listing more than doc updates.

Can you please correct this and let me know when ready for review. You can either rollback those other file updates or create a new branch with just doc updates. Thanks.
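
For reference, a small sketch of what the three-dot diff reports, using a throwaway repository with made-up branch and file names (doc-branch, doc.txt, other.py, master-only.txt). It lists only what changed on the branch since it diverged from master, which is how unrelated files on a branch show up.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "doc" > doc.txt
git add doc.txt
git commit --quiet -m "Initial docs"

# Branch with a doc change plus an unrelated file.
git checkout --quiet -b doc-branch
echo "more" >> doc.txt
echo "code" > other.py
git add doc.txt other.py
git commit --quiet -m "Doc update plus unrelated change"

# Meanwhile master moves on.
git checkout --quiet master
echo "x" > master-only.txt
git add master-only.txt
git commit --quiet -m "Work on master"

# Files changed on doc-branch since it forked from master.
changed=$(git diff --name-only master...doc-branch)
echo "$changed"
```

Here the output is doc.txt and other.py. The file added on master afterwards is not listed, because the three-dot form compares the branch against the common ancestor rather than against the current tip of master.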

#10 Updated by Radhika Chippada about 4 years ago

Also, initial review comments for the newly added "Working with Arvados git repository" page.

  • Title: please replace “your” in “Working with your Arvados git repository” with “an” so that it reads “Working with an Arvados git repository”. Also, please check the case of “git repository”. Other titles do not capitalize all words.
  • Can you please rename (the previous section) as “Adding a new Arvados repository” (upper case Arvados)?
  • Can you please replace all “your Arvados repo…” with “an Arvados repo” ?
  • Liquid error: No such template ‘tutorial_git_repo_expectations’ => do you need to add this to git? Not sure what this include says, but I think it should include “you are logged into a shell VM stuff” and also a note saying that this tutorial uses the “tutorial” git repository …
  • Note, a new directory named tutorial will be created => bold tutorial? (tutorial)
  • Can you merge these two into one sentence please “You may get the following warning” and “Do not worry about this, cloning an empty repository is not a problem.” => “Ignore any warning about cloning an empty repository …”
  • git clone :$USER/tutorial.git => I think we are trying to encourage users to use their https URLs. That said: it appears that this requires a lot more information here. I think you need to move the entire “Clone arvados repository” section from “Running on an Arvados cluster” into this page and cross reference this in the "Running on an Arvados cluster" page
  • “git checkout -b tutorial_branch” => This needs to be preceded by a “cd tutorial” to ensure the user is in the cloned repository directory. Also, can you please make this a section “Create a git branch”. This would go well with the “Clone arvados repository” section …
  • Please add a section “add a script to git” or something like that to cover “git add tutorial.txt” and “git commit -a” and “git push …” I think it would be similar to “Creating a Crunch script” in “Running on an Arvados cluster”. BTW, this particular section in “Running on an Arvados cluster” has a typo. It says “cd $USER”. It should say “cd tutorial”. Can you please fix it while you are at it. Thanks.
  • “Now you’re ready to use this script as part of an Arvados job” => since the example uses “tutorial.txt”, this sentence saying “script” does not seem appropriate. Do you want to use a script example instead (similar to “Running on an Arvados cluster”)

#11 Updated by Bryan Cosca about 4 years ago

Branch 6862-git-doc-guide-new commit 39c9193f03471aa7826769b34d6b55890a2c98a3

Radhika Chippada wrote:

Also, initial review comments for the newly added "Working with Arvados git repository" page.

  • Title: please replace “your” in “Working with your Arvados git repository” with “an” so that it reads “Working with an Arvados git repository”. Also, please check the case of “git repository”. Other titles do not capitalize all words.

Done

  • Can you please rename (the previous section) as “Adding a new Arvados repository” (upper case Arvados)?

Done

  • Can you please replace all “your Arvados repo…” with “an Arvados repo” ?

Done

  • Liquid error: No such template ‘tutorial_git_repo_expectations’ => do you need to add this to git? Not sure what this include says, but I think it should include “you are logged into a shell VM stuff” and also a note saying that this tutorial uses the “tutorial” git repository …

Added

  • Note, a new directory named tutorial will be created => bold tutorial? (tutorial)

Done

  • Can you merge these two into one sentence please “You may get the following warning” and “Do not worry about this, cloning an empty repository is not a problem.” => “Ignore any warning about cloning an empty repository …”

Removed the warning and added a sentence.

  • git clone :$USER/tutorial.git => I think we are trying to encourage users to use their https URLs. That said: it appears that this requires a lot more information here. I think you need to move the entire “Clone arvados repository” section from “Running on an Arvados cluster” into this page and cross reference this in the "Running on an Arvados cluster" page

I tried using the https URL, but ran into https://arvados.org/issues/6263. That being said, I can change the URL to https if that is preferred and we think that issue will be fixed within the next sprint.

I did everything else too.

  • “git checkout -b tutorial_branch” => This needs to be preceded by a “cd tutorial” to ensure the user is in the cloned repository directory. Also, can you please make this a section “Create a git branch”. This would go well with the “Clone arvados repository” section …

Done. I would add it to the Clone arvados repository section, but that would require a lot more changes than just adding a section, because I would have to change the whole flow to push to the branch instead of master.

  • Please add a section “add a script to git” or something like that to cover “git add tutorial.txt” and “git commit -a” and “git push …” I think it would be similar to “Creating a Crunch script” in “Running on an Arvados cluster”. BTW, this particular section in “Running on an Arvados cluster” has a typo. It says “cd $USER”. It should say “cd tutorial”. Can you please fix it while you are at it. Thanks.

Done

  • “Now you’re ready to use this script as part of an Arvados job” => since the example uses “tutorial.txt”, this sentence saying “script” does not seem appropriate. Do you want to use a script example instead (similar to “Running on an Arvados cluster”)

I think having a section like Running on an Arvados cluster shows that you can do it with a script example. I think this section should be for general script additions, like people's shell scripts or random files. I'll modify the last sentence to be like that.

#12 Updated by Radhika Chippada about 4 years ago

  • Working with an Arvados git repository is analogous to working with other public repositories. If you are already familiar with git, feel free to skip this part of the documentation.
    I don’t think we want to say “feel free to skip this part” because now it contains a lot more information than just doing a git clone (finding the tutorial repository, etc.). Also, I think you can merge this and the next sentence as something like: “This tutorial describes how to work with a new Arvados git repository. Working with an Arvados git repository is analogous to working with other public git repositories. It will show you how to upload custom scripts to a remote Arvados repository, so that you can use them in Arvados pipelines.”
  • The note “you can follow the ... page” => can you please say “you can follow the instructions in the … page”?
  • Title: Can you also please add “git” in the previous page title “Adding a new Arvados repository”? I think both page titles in left nav using the same wording makes it more readable.
  • “Cloning an Arvados branch” => Now we have this section in two places, which can be confusing for a user who is going through the docs in order. It also adds maintenance overhead. Can you please remove this section from “Running on an Arvados cluster” and replace it with something like: Clone arvados repository (section title) followed by “please clone the tutorial repository using the instructions from ... page … ”
  • Also, can you copy the Note “for more information about using git” into this new page. I think it would actually be nice to have it at the top of the page even before the “Clone” section (please see how it looks). If it seems good, I think it makes sense to move it up in the “Running on an Arvados cluster” page as well.
  • “Creating a git branch” => “Creating a git branch in an Arvados repository” ?
  • “Adding a script to git” => “Adding files or scripts to an Arvados repository” ? (since you are showing a text file as an example)
  • “First create a new file in the local repo” => “Create a file named tutorial.txt in the local repository” ?
  • “Next, commit all the changes to the local repository, along with a message of what you've accomplished” => “Next, commit all the changes to the local repository, along with a commit message that describes what this script does” ?
  • “Although this tutorial showed how to add a text file to Arvados, this tutorial should also show the necessary steps for adding your custom bash, R, or python scripts to an Arvados repository” => This is a bit confusing. Can you say something like “Although this tutorial describes how to add a text file to an Arvados repository, these same instructions can be used to add crunch scripts that can be used to run your pipelines”?

#13 Updated by Bryan Cosca about 4 years ago

On Commit 406b3de5426bf0d63564410cf6caf2834ba2b7bb

Radhika Chippada wrote:

  • Working with an Arvados git repository is analogous to working with other public repositories. If you are already familiar with git, feel free to skip this part of the documentation.
    I don’t think we want to say “feel free to skip this part” because now it contains a lot more information than just doing a git clone (finding the tutorial repository, etc.). Also, I think you can merge this and the next sentence as something like: “This tutorial describes how to work with a new Arvados git repository. Working with an Arvados git repository is analogous to working with other public git repositories. It will show you how to upload custom scripts to a remote Arvados repository, so that you can use them in Arvados pipelines.”

Done

  • The note “you can follow the ... page” => can you please say “you can follow the instructions in the … page”?

Done

  • Title: Can you also please add “git” in the previous page title “Adding a new Arvados repository”? I think both page titles in left nav using the same wording makes it more readable.

Done

  • “Cloning an Arvados branch” => Now we have this section in two places, which can be confusing for a user who is going through the docs in order. It also adds maintenance overhead. Can you please remove this section from “Running on an Arvados cluster” and replace it with something like: Clone arvados repository (section title) followed by “please clone the tutorial repository using the instructions from ... page … ”

Done

  • Also, can you copy the Note “for more information about using git” into this new page. I think it would actually be nice to have it at the top of the page even before the “Clone” section (please see how it looks). If it seems good, I think it makes sense to move it up in the “Running on an Arvados cluster” page as well.

I don't think it looks good to put it up top on the "Running on an Arvados cluster" page. I did remove it from the Clone arvados repository section, so if you want it on that page, let me know where to put it.

  • “Creating a git branch” => “Creating a git branch in an Arvados repository” ?

Done

  • “Adding a script to git” => “Adding files or scripts to an Arvados repository” ? (since you are showing a text file as an example)

Done

  • “First create a new file in the local repo” => “Create a file named tutorial.txt in the local repository” ?

Done

  • “Next, commit all the changes to the local repository, along with a message of what you've accomplished” => “Next, commit all the changes to the local repository, along with a commit message that describes what this script does” ?

Done

  • “Although this tutorial showed how to add a text file to Arvados, this tutorial should also show the necessary steps for adding your custom bash, R, or python scripts to an Arvados repository” => This is a bit confusing. Can you say something like “Although this tutorial describes how to add a text file to an Arvados repository, these same instructions can be used to add crunch scripts that can be used to run your pipelines”?

With run-command, it's not necessary for these to be crunch scripts; they could be any custom R/java/perl/python script. I don't want customers to have to convert their script to a crunch script.

#14 Updated by Radhika Chippada about 4 years ago

  • Please add a sentence along the lines “Create a git branch named tutorial_branch in the tutorial Arvados git repository” below the section title “Creating a git branch in an Arvados repository”.
  • Regarding the closing statement on the page, it is still confusing “this tutorial should also show the necessary steps”. It feels like the tutorial “should” have listed something but is missing. I was just trying to see if we can rephrase it something like “Although this tutorial showed how to add a text file to Arvados, the same steps can be used to add any of your custom bash, R, or python scripts to an Arvados repository.” It can even be moved to the top of the section right after "create a file named tutorial.txt in the local repo" to make it clear right off the bat.
  • In the page “Running on an Arvados cluster”: “Please clone the tutorial repository using the instructions from Working with Arvados git repository” => Can you highlight the word tutorial? Also, if it looks good, please consider adding “, if you have not yet cloned already” because it is possible that the user already did it while going through the docs in sequence.

LGTM after your consideration of these suggestions. Thanks.

#15 Updated by Bryan Cosca about 4 years ago

Radhika Chippada wrote:

  • Please add a sentence along the lines “Create a git branch named tutorial_branch in the tutorial Arvados git repository” below the section title “Creating a git branch in an Arvados repository”.

Done

  • Regarding the closing statement on the page, it is still confusing “this tutorial should also show the necessary steps”. It feels like the tutorial “should” have listed something but is missing. I was just trying to see if we can rephrase it something like “Although this tutorial showed how to add a text file to Arvados, the same steps can be used to add any of your custom bash, R, or python scripts to an Arvados repository.” It can even be moved to the top of the section right after "create a file named tutorial.txt in the local repo" to make it clear right off the bat.

I removed the last sentence and added your suggested sentence to after "create a file..."

  • In the page “Running on an Arvados cluster”: “Please clone the tutorial repository using the instructions from Working with Arvados git repository” => Can you highlight the word tutorial? Also, if it looks good, please consider adding “, if you have not yet cloned already” because it is possible that the user already did it while going through the docs in sequence.

Done

LGTM after your consideration of these suggestions. Thanks.

Thanks!

#16 Updated by Bryan Cosca about 4 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:8dccf76ea830057d433b07c40bc2c1294891ea39.
