Feature #21494
closed
Get Java and R SDKs out of the critical path of main branch builds
Description
The big idea: The Java and R SDKs are neither mature nor critical enough that it makes sense to hold them to the same build standards as the rest of Arvados. They should normally not block developer-run-tests or the other jobs that follow a merge to main. Instead we can have a separate Jenkins job to run when needed (like testing a change to one of these specific SDKs) and as part of the larger release pipeline.
Parts of the job:
- In doc/Rakefile, consider a way to specify which SDKs you do and don't want to build docs for. We want to build the Python SDK as part of developer-run-tests, and the R SDK as part of this new Jenkins job. It would be nice if there was a switch that accepted a list of known SDKs and built what you specified.
- Write a Jenkins job that tests the Java SDK, R SDK, and doc linkchecker after building R documentation.
- Add this new job to multijobs and pipelines where needed, per above.
- Reorganize developer-run-tests to remove those tests from the existing jobs. (It might make sense to do a slightly larger reorganization as part of this.)
- Note that the doc publishing job (not the linkchecker test) should still build and publish all SDKs. Retaining that behavior is a requirement.
Updated by Brett Smith 11 months ago
- Related to Bug #21498: CRAN package download failures causing test failures added
Updated by Brett Smith 11 months ago
- Related to Bug #21321: R SDK install is flaky - stringi doesn't download completely added
Updated by Peter Amstutz 11 months ago
- Target version changed from Future to Development 2024-02-28 sprint
Updated by Brett Smith 11 months ago
21494-sdk-doc-build @ f3081db313f5b99ec40e41e279f6ce2bbf156fca - developer-run-tests: #4048
This adds a new environment variable to doc/Rakefile that lets you specify which SDKs you want to build unconditionally. The new Java+R SDK jobs can say bundle exec rake generate sdks=java,r. The regular developer-run-tests pipeline can say bundle exec rake generate sdks=python. The full documentation build for doc.arvados.org can say bundle exec rake generate sdks=all.
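For readers who want a concrete picture of how an sdks value like the ones above could be interpreted, here is a minimal Ruby sketch. It is not the code from the branch: the KNOWN_SDKS list and the parse_sdks helper are hypothetical names made up for illustration, and the sketch does not model the fallback to the older skip settings when sdks is unset. Rake copies NAME=VALUE command-line arguments into ENV, so a Rakefile could read the setting as ENV["sdks"].

```ruby
# Hypothetical list of SDK names for illustration only; the real
# doc/Rakefile may recognize a different set.
KNOWN_SDKS = %w[java python r].freeze

# Turn an `sdks` value such as "java,r", "all", or "none" into the list
# of SDKs whose documentation should be built.
# Raises ArgumentError for any name that is not in KNOWN_SDKS.
def parse_sdks(value)
  names = value.to_s.downcase.split(",").map(&:strip).reject(&:empty?)
  return KNOWN_SDKS.dup if names == ["all"]
  return [] if names == ["none"]

  unknown = names - KNOWN_SDKS
  unless unknown.empty?
    raise ArgumentError, "unknown SDK(s): #{unknown.join(', ')}"
  end
  names.uniq
end

# Inside a Rakefile this might be called as parse_sdks(ENV["sdks"]).
selected = parse_sdks("java,r")
puts "Building docs for: #{selected.join(', ')}"
```

Rejecting unknown names outright, rather than ignoring them, is one possible design choice: a typo in a Jenkins job definition then fails loudly instead of silently skipping an SDK.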
- All agreed upon points are implemented / addressed.
- Yes
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- This branch does not complete the story, but Jenkins reconfiguration happens outside the repository
- Code is tested and passing, both automated and manual, what manual testing was done is described
- See above, also tested different builds manually
- Documentation has been updated.
- Yes
- Behaves appropriately at the intended scale (describe intended scale).
- No change in scale
- Considered backwards and forwards compatibility issues between client and server.
- doc/Rakefile still supports the various previous ways of skipping SDK builds as long as you don't set the new sdks variable
- Follows our coding standards and GUI style guidelines.
- Yes
Updated by Tom Clegg 11 months ago
This branch LGTM, thanks.
The Java and R SDKs [...] should normally not block [...] other jobs that follow a merge to main
Java/R SDKs will still be able to stop doc.arvados.org from updating, right? That seems like a harder problem to solve. Are we keeping that as a follow-up story, or just accepting the situation? I'm leaning toward the latter.
Updated by Brett Smith 11 months ago
Tom Clegg wrote in #note-6:
Java/R SDKs will still be able to stop doc.arvados.org from updating, right? That seems like a harder problem to solve. Are we keeping that as a follow-up story, or just accepting the situation? I'm leaning toward the latter.
Yes, the latter, see last bullet in the story. With this we at least get the benefit of getting the build out of developer-run-tests, so there are fewer random "why did tests fail?" incidents as a branch goes through review. Between running it less often and the R SDK improvements in #21321, the doc site update can still fail, but hopefully at a tolerable rate.
Updated by Brett Smith 11 months ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|3088d48fc13089013fc0e0fa252c3d5122fd073d.
Updated by Brett Smith 11 months ago
- Status changed from Resolved to In Progress
Propose the following Jenkins organization:
- developer-run-tests-sdk-python-ruby grows to build and linkcheck the PySDK documentation
- developer-run-tests-doc-and-sdk-R becomes developer-run-tests-sdk-java-R. It tests both SDKs, and builds and linkchecks their documentation. It gets removed from developer-run-tests. It gets added to the release checklist as one of the jobs that needs to run and pass before a release can go out.
- developer-run-tests-remainder just needs to get updated to match.
- The script that publishes to doc.arvados.org needs to be updated to run with sdks=all.
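As a rough summary of the proposal at this point in the thread, the mapping from each job to its documentation build could be written down like this. It is only a sketch: the two Jenkins job names come from the list above, while "publish-docs" is a hypothetical placeholder for the doc.arvados.org publishing script, whose real name is not given here. JOB_SDKS and doc_build_command are likewise illustrative names, not anything in the repository.

```ruby
# Which SDK docs each job would build under the proposal.
# "publish-docs" is a hypothetical stand-in for the publishing script.
JOB_SDKS = {
  "developer-run-tests-sdk-python-ruby" => "python",
  "developer-run-tests-sdk-java-R" => "java,r",
  "publish-docs" => "all",
}.freeze

# Build the rake command line a job would run for its doc build.
# Hash#fetch raises KeyError for a job that is not in the table.
def doc_build_command(job)
  sdks = JOB_SDKS.fetch(job)
  "bundle exec rake generate sdks=#{sdks}"
end

JOB_SDKS.each_key do |job|
  puts "#{job}: #{doc_build_command(job)}"
end
```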
Updated by Brett Smith 11 months ago
21494-sdk-doc-linkchecker @ 810846168f8e14f63caefac534b843b7681b881f - developer-run-tests: #4052
Further testing revealed that in order for us to run linkchecker as planned in Jenkins, the linkchecker task needed some updates to match the SDK selection code.
- All agreed upon points are implemented / addressed.
- N/A? Hopefully it's clear how this is a bridge between the previous branch and the desired Jenkins changes discussed at standup.
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- This branch does not complete the story, but Jenkins reconfiguration happens outside the repository
- Code is tested and passing, both automated and manual, what manual testing was done is described
- See above, also successfully ran test doc sdks=none and test doc sdks=all locally.
- Documentation has been updated.
- N/A, covered by previous branch
- Behaves appropriately at the intended scale (describe intended scale).
- No change in scale
- Considered backwards and forwards compatibility issues between client and server.
- As before, this is backwards-compatible with the previous "no-sdk" knobs
- Follows our coding standards and GUI style guidelines.
- Yes
Updated by Brett Smith 11 months ago
Brett Smith wrote in #note-10:
Propose the following Jenkins organization:
- developer-run-tests-sdk-python-ruby grows to build and linkcheck the PySDK documentation
- developer-run-tests-doc-and-sdk-R becomes developer-run-tests-sdk-java-R. It tests both SDKs, and builds and linkchecks their documentation. It gets removed from developer-run-tests. It gets added to the release checklist as one of the jobs that needs to run and pass before a release can go out.
- developer-run-tests-remainder just needs to get updated to match.
All the Jenkins parts of this are done (so, not the release checklist bits).
developer-run-tests-doc-sdk-java-R: #2222 (the long changelist is because I pasted the wrong commit hash previously, my bad)
I am very sorely tempted to reorganize developer-run-tests into just three jobs: the current -remainder, -workbench2, and then one that combines -api, -fuse, and -sdk-ruby-python. All three jobs would run in ~25 minutes, so we'd get a very even load with no increase in total runtime. This would free up Jenkins slots so that, e.g., two developer-run-tests jobs could run entirely in parallel. But before I do that, I wanted to at least note here that the planned changes are done and successful.
I saved copies of the original job scripts before I edited them, so we have those in case we need to do a quick revert or something.
Updated by Brett Smith 11 months ago
Hm, d-r-t-sdk-ruby-python is not actually running the Python tests. The command is in the console output and run-tests.sh seems to run afterwards but it doesn't seem to do anything. I'll look at it in the morning, it looks likely to be Python 2/3 cruft.
Updated by Brett Smith 11 months ago
Brett Smith wrote in #note-13:
I am very sorely tempted to reorganize developer-run-tests into just three jobs: the current -remainder, -workbench2, and then one that combines -api, -fuse, and -sdk-ruby-python. All three jobs would run in ~25 minutes, so we'd get a very even load with no increase in total runtime. This would free up Jenkins slots so that, e.g., two developer-run-tests jobs could run entirely in parallel.
developer-run-tests: #4055 - Note that I moved the sdk/ruby tests to -remainder because that was easier. They take 2sec so they're not a set we're trying to runtime-optimize.
I also updated developer-run-tests-doc-sdk-java-R to linkcheck the documentation with all the SDKs, not just Java and R. This provides a more thorough test before release, which seems helpful. I thought this would happen as part of a separate "update doc.arvados.org" Jenkins job but there is no such job.
Updated by Brett Smith 11 months ago
The script that publishes doc.arvados.org is updated. At this point all that's left is updating the release checklist.
Updated by Peter Amstutz 10 months ago
- Target version changed from Development 2024-02-28 sprint to Development 2024-03-13 sprint
Updated by Peter Amstutz 10 months ago
- Target version set to Development 2024-03-13 sprint
- Tracker changed from Idea to Feature
Updated by Peter Amstutz 10 months ago
Brett Smith wrote in #note-16:
The script that publishes doc.arvados.org is updated. At this point all that's left is updating the release checklist.
https://dev.arvados.org/projects/arvados/wiki/Release_Checklist
Note the meta-process section, there's a file called TASKS which is used to create Redmine issues automatically and needs to be kept in sync with the wiki page.
Updated by Brett Smith 10 months ago
Peter Amstutz wrote in #note-19:
https://dev.arvados.org/projects/arvados/wiki/Release_Checklist
What seems most straightforward to me is for everyone to be on the same page that step #3, "Ensure that the release staging branch passes automated tests on jenkins." includes the new Java+R SDK test. I'm happy to add that, but it seems like it would be even better to list all of the Jenkins jobs that are expected to pass on the release branch on this page. What are they?
If we make that change, is that sufficient by itself? There would be no immediate need to update the TASKS file if we're just expanding the definition of that task.
The wiki could use a modernization pass in general. e.g., it still refers to the separate workbench2 repository. I'd be fine doing that with fresh eyes too if there's no objection.
Updated by Peter Amstutz 10 months ago
Brett Smith wrote in #note-20:
Peter Amstutz wrote in #note-19:
https://dev.arvados.org/projects/arvados/wiki/Release_Checklist
What seems most straightforward to me is for everyone to be on the same page that step #3, "Ensure that the release staging branch passes automated tests on jenkins." includes the new Java+R SDK test. I'm happy to add that, but it seems like it would be even better to list all of the Jenkins jobs that are expected to pass on the release branch on this page. What are they?
Well, it is slightly vague just to account for changes in Jenkins configuration without making the process go out of date -- and because for new releases this generally means the current "developer-run-tests" but for point releases it means the "2.X-developer-run-tests" because we archive the old Jenkins jobs to make sure they're still runnable.
However I think explicitly calling out the R SDK and Java SDK would make sense since they have been split off from the current "developer-run-tests".
If we make that change, is that sufficient by itself? There would be no immediate need to update the TASKS file if we're just expanding the definition of that task.
Sounds good.
The wiki could use a modernization pass in general. e.g., it still refers to the separate workbench2 repository. I'd be fine doing that with fresh eyes too if there's no objection.
Please do.
Updated by Brett Smith 10 months ago
Well, it is slightly vague just to account for changes in Jenkins configuration without making the process go out of date
There's no value to being vague like this when like half the rows in the table already link to other specific Jenkins jobs. And yeah, half those links were broken. Documentation needs to be kept up-to-date.
and because for new releases this generally means the current "developer-run-tests" but for point releases it means the "2.X-developer-run-tests" because we archive the old Jenkins jobs to make sure they're still runnable.
Documented this. Again, there are already other tasks with similar splits.
Please do.
Updated by Peter Amstutz 10 months ago
- Target version changed from Development 2024-03-13 sprint to Development 2024-03-27 sprint
Updated by Peter Amstutz 10 months ago
Brett Smith wrote in #note-22:
Well, it is slightly vague just to account for changes in Jenkins configuration without making the process go out of date
There's no value to being vague like this when like half the rows in the table already link to other specific Jenkins jobs. And yeah, half those links were broken. Documentation needs to be kept up-to-date.
and because for new releases this generally means the current "developer-run-tests" but for point releases it means the "2.X-developer-run-tests" because we archive the old Jenkins jobs to make sure they're still runnable.
Documented this. Again, there are already other tasks with similar splits.
Please do.
LGTM.
The only typo I noticed is you deleted the footnote at the bottom of the page without deleting the reference to the footnote (1) on the steps.
Updated by Peter Amstutz 10 months ago
- Target version changed from Development 2024-03-27 sprint to Development 2024-04-10 sprint
Updated by Peter Amstutz 10 months ago
- Target version changed from Development 2024-04-10 sprint to Development 2024-03-27 sprint
Updated by Peter Amstutz 10 months ago
- Status changed from In Progress to Resolved