Feature #21494

closed

Get Java and R SDKs out of the critical path of main branch builds

Added by Brett Smith 11 months ago. Updated 8 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Deployment
Story points: -
Release:
Release relationship: Auto

Description

The big idea: The Java and R SDKs are neither mature nor critical enough that it makes sense to hold them to the same build standards as the rest of Arvados. They should normally not block developer-run-tests or the other jobs that follow a merge to main. Instead we can have a separate Jenkins job to run when needed (like testing a change to one of these specific SDKs) and as part of the larger release pipeline.

Parts of the job:

  • In doc/Rakefile, consider a way to specify which SDKs you do and don't want to build docs for. We want to build the Python SDK docs as part of developer-run-tests, and the R SDK docs as part of this new Jenkins job. It would be nice if there were a switch that accepted a list of known SDKs and built what you specified.
  • Write a Jenkins job that tests the Java SDK, R SDK, and doc linkchecker after building R documentation.
  • Add this new job to multijobs and pipelines where needed, per above.
  • Reorganize developer-run-tests to remove those tests from the existing jobs. (It might make sense to do a somewhat larger reorganization as part of this.)
  • Note that the doc publishing job (not the linkchecker test) should still build and publish all SDKs. Retaining that behavior is a requirement.

Subtasks 3 (0 open, 3 closed)

Task #21519: Review 21494-sdk-doc-build (Resolved, Brett Smith, 02/21/2024)
Task #21538: Review 21494-sdk-doc-linkchecker (Resolved, Brett Smith, 02/22/2024)
Task #21574: Review Release Checklist revision 70 (Resolved, Peter Amstutz, 03/18/2024)

Related issues 2 (0 open, 2 closed)

Related to Arvados - Bug #21498: CRAN package download failures causing test failures (Duplicate)
Related to Arvados - Bug #21321: R SDK install is flaky - stringi doesn't download completely (Resolved, Brett Smith)
Actions #1

Updated by Brett Smith 11 months ago

  • Related to Bug #21498: CRAN package download failures causing test failures added
Actions #2

Updated by Brett Smith 11 months ago

  • Related to Bug #21321: R SDK install is flaky - stringi doesn't download completely added
Actions #3

Updated by Peter Amstutz 11 months ago

  • Target version changed from Future to Development 2024-02-28 sprint
Actions #4

Updated by Peter Amstutz 11 months ago

  • Assigned To set to Brett Smith
Actions #5

Updated by Brett Smith 11 months ago

21494-sdk-doc-build @ f3081db313f5b99ec40e41e279f6ce2bbf156fca - developer-run-tests: #4048

This adds a new environment variable to doc/Rakefile that lets you specify which SDKs you want to build unconditionally. The new Java+R SDK jobs can say bundle exec rake generate sdks=java,r. The regular developer-run-tests pipeline can say bundle exec rake generate sdks=python. The full documentation build for doc.arvados.org can say bundle exec rake generate sdks=all.
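The selection behavior described above can be sketched roughly as follows. This is a Python illustration of the logic only, not the Ruby code in doc/Rakefile: the function name parse_sdks, the KNOWN_SDKS list, and the error handling are all assumptions. Only the example values (java,r, python, all, and the none value used in local testing later in this thread) come from the thread itself.

```python
# Assumed list of SDK names; the thread only mentions Java, Python, and R.
KNOWN_SDKS = ("java", "python", "r")


def parse_sdks(value):
    """Return the set of SDKs whose docs should be built for an sdks= value.

    Accepts a comma-separated list such as "java,r", or the special
    values "all" and "none". Unknown names raise ValueError so that a
    typo does not silently skip a build (an assumed design choice).
    """
    names = [name.strip().lower() for name in value.split(",") if name.strip()]
    if names == ["all"]:
        return set(KNOWN_SDKS)
    if names == ["none"]:
        return set()
    unknown = sorted(set(names) - set(KNOWN_SDKS))
    if unknown:
        raise ValueError("unknown SDK(s): " + ", ".join(unknown))
    return set(names)


# The three invocations named in the note above.
for setting in ("java,r", "python", "all"):
    print(setting, "->", sorted(parse_sdks(setting)))
```

The sketch only covers the case where sdks is set. Per the compatibility note in the checklist, the real Rakefile falls back to its earlier SDK-skipping behavior when the variable is absent, which is not modeled here.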

  • All agreed upon points are implemented / addressed.
    • Yes
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
    • This branch does not complete the story, but Jenkins reconfiguration happens outside the repository
  • Code is tested and passing, both automated and manual, what manual testing was done is described
    • See above, also tested different builds manually
  • Documentation has been updated.
    • Yes
  • Behaves appropriately at the intended scale (describe intended scale).
    • No change in scale
  • Considered backwards and forwards compatibility issues between client and server.
    • doc/Rakefile still supports the various previous ways of skipping SDK builds as long as you don't set the new sdks variable
  • Follows our coding standards and GUI style guidelines.
    • Yes
Actions #6

Updated by Tom Clegg 11 months ago

This branch LGTM, thanks.

The Java and R SDKs [...] should normally not block [...] other jobs that follow a merge to main

Java/R SDKs will still be able to stop doc.arvados.org from updating, right? That seems like a harder problem to solve. Are we keeping that as a follow-up story, or just accepting the situation? I'm leaning toward the latter.

Actions #7

Updated by Tom Clegg 11 months ago

  • Status changed from New to In Progress
Actions #8

Updated by Brett Smith 11 months ago

Tom Clegg wrote in #note-6:

Java/R SDKs will still be able to stop doc.arvados.org from updating, right? That seems like a harder problem to solve. Are we keeping that as a follow-up story, or just accepting the situation? I'm leaning toward the latter.

Yes, the latter; see the last bullet in the story. With this we at least get the build out of developer-run-tests, so there are fewer random "why did tests fail?" incidents as a branch goes through review. The doc site update can still fail, but between running the build less often and the R SDK improvements in #21321, hopefully failures will stay at a reasonable level.

Actions #9

Updated by Brett Smith 11 months ago

  • Status changed from In Progress to Resolved
Actions #10

Updated by Brett Smith 11 months ago

  • Status changed from Resolved to In Progress

Propose the following Jenkins organization:

  • developer-run-tests-sdk-python-ruby grows to build and linkcheck the PySDK documentation
  • developer-run-tests-doc-and-sdk-R becomes developer-run-tests-sdk-java-R. It tests both SDKs, and builds and linkchecks their documentation. It gets removed from developer-run-tests. It gets added to the release checklist as one of the jobs that needs to run and pass before a release can go out.
  • developer-run-tests-remainder just needs to get updated to match.
  • The script that publishes to doc.arvados.org needs to be updated to run with sdks=all.
Actions #11

Updated by Brett Smith 11 months ago

21494-sdk-doc-linkchecker @ 810846168f8e14f63caefac534b843b7681b881f - developer-run-tests: #4052

Further testing revealed that in order for us to run linkchecker as planned in Jenkins, the linkchecker task needed some updates to match the SDK selection code.

  • All agreed upon points are implemented / addressed.
    • N/A? Hopefully it's clear how this is a bridge between the previous branch and the desired Jenkins changes discussed at standup.
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
    • This branch does not complete the story, but Jenkins reconfiguration happens outside the repository
  • Code is tested and passing, both automated and manual, what manual testing was done is described
    • See above, also successfully ran test doc sdks=none and test doc sdks=all locally.
  • Documentation has been updated.
    • N/A, covered by previous branch
  • Behaves appropriately at the intended scale (describe intended scale).
    • No change in scale
  • Considered backwards and forwards compatibility issues between client and server.
    • As before, this is backwards-compatible with the previous "no-sdk" knobs
  • Follows our coding standards and GUI style guidelines.
    • Yes
Actions #12

Updated by Tom Clegg 11 months ago

LGTM, thanks.

Actions #13

Updated by Brett Smith 11 months ago

Brett Smith wrote in #note-10:

Propose the following Jenkins organization:

  • developer-run-tests-sdk-python-ruby grows to build and linkcheck the PySDK documentation
  • developer-run-tests-doc-and-sdk-R becomes developer-run-tests-sdk-java-R. It tests both SDKs, and builds and linkchecks their documentation. It gets removed from developer-run-tests. It gets added to the release checklist as one of the jobs that needs to run and pass before a release can go out.
  • developer-run-tests-remainder just needs to get updated to match.

All the Jenkins parts of this are done (so, not the release checklist bits).

developer-run-tests: #4054

developer-run-tests-doc-sdk-java-R: #2222 (the long changelist is because I pasted the wrong commit hash previously, my bad)

I am very sorely tempted to reorganize developer-run-tests into just three jobs: the current -remainder, -workbench2, and then one that combines -api, -fuse, and -sdk-ruby-python. All three jobs would run in ~25 minutes, so we'd get a very even load with no increase in total runtime. This would free up Jenkins slots so that, e.g., two developer-run-tests jobs could run entirely in parallel. But before I do that, I wanted to at least note here that the planned changes are done and successful.
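As a rough illustration of the reasoning here: if the sub-jobs of developer-run-tests run in parallel, wall-clock time is set by the slowest sub-job, so merging short jobs until each group takes about as long as the slowest one reduces the number of Jenkins slots without lengthening the run. The sketch below uses made-up durations for -api, -fuse, and -sdk-ruby-python; only the ~25 minute figure comes from this note, and the idea that merged jobs run back to back is an assumption.

```python
def wall_clock_minutes(groups):
    """Groups run in parallel; jobs inside one group run back to back."""
    return max(sum(group.values()) for group in groups)


# Hypothetical durations in minutes. Only "about 25" is from the thread.
current = [
    {"remainder": 25},
    {"workbench2": 25},
    {"api": 12},
    {"fuse": 8},
    {"sdk-ruby-python": 5},
]
proposed = [
    {"remainder": 25},
    {"workbench2": 25},
    {"api": 12, "fuse": 8, "sdk-ruby-python": 5},
]

for label, groups in (("current", current), ("proposed", proposed)):
    print(f"{label}: {len(groups)} slots, {wall_clock_minutes(groups)} min wall clock")
```

With these illustrative numbers the proposed layout uses three slots instead of five while the slowest group still finishes in the same time, which is the "even load with no increase in total runtime" argument.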

I saved copies of the original job scripts before I edited them, so we have those in case we need to do a quick revert or something.

Actions #14

Updated by Brett Smith 11 months ago

Hm, d-r-t-sdk-ruby-python is not actually running the Python tests. The command is in the console output and run-tests.sh seems to run afterwards but it doesn't seem to do anything. I'll look at it in the morning, it looks likely to be Python 2/3 cruft.

Actions #15

Updated by Brett Smith 11 months ago

Brett Smith wrote in #note-13:

I am very sorely tempted to reorganize developer-run-tests into just three jobs: the current -remainder, -workbench2, and then one that combines -api, -fuse, and -sdk-ruby-python. All three jobs would run in ~25 minutes, so we'd get a very even load with no increase in total runtime. This would free up Jenkins slots so that, e.g., two developer-run-tests jobs could run entirely in parallel.

developer-run-tests: #4055 - Note that I moved the sdk/ruby tests to -remainder because that was easier. They take 2 seconds, so they're not a set we're trying to runtime-optimize.

I also updated developer-run-tests-doc-sdk-java-R to linkcheck the documentation with all the SDKs, not just Java and R. This provides a more thorough test before release, which seems helpful. I thought this would happen as part of a separate "update doc.arvados.org" Jenkins job but there is no such job.

developer-run-tests-doc-sdk-java-R: #2223

Actions #16

Updated by Brett Smith 11 months ago

The script that publishes doc.arvados.org is updated. At this point all that's left is updating the release checklist.

Actions #17

Updated by Peter Amstutz 10 months ago

  • Target version changed from Development 2024-02-28 sprint to Development 2024-03-13 sprint
Actions #18

Updated by Peter Amstutz 10 months ago

  • Target version set to Development 2024-03-13 sprint
  • Tracker changed from Idea to Feature
Actions #19

Updated by Peter Amstutz 10 months ago

Brett Smith wrote in #note-16:

The script that publishes doc.arvados.org is updated. At this point all that's left is updating the release checklist.

https://dev.arvados.org/projects/arvados/wiki/Release_Checklist

Note the meta-process section: there's a file called TASKS that is used to create Redmine issues automatically and needs to be kept in sync with the wiki page.

Actions #20

Updated by Brett Smith 10 months ago

Peter Amstutz wrote in #note-19:

https://dev.arvados.org/projects/arvados/wiki/Release_Checklist

What seems most straightforward to me is for everyone to be on the same page that step #3, "Ensure that the release staging branch passes automated tests on jenkins." includes the new Java+R SDK test. I'm happy to add that, but it seems like it would be even better to list all of the Jenkins jobs that are expected to pass on the release branch on this page. What are they?

If we make that change, is that sufficient by itself? There would be no immediate need to update the TASKS file if we're just expanding the definition of that task.

The wiki could use a modernization pass in general. e.g., it still refers to the separate workbench2 repository. I'd be fine doing that with fresh eyes too if there's no objection.

Actions #21

Updated by Peter Amstutz 10 months ago

Brett Smith wrote in #note-20:

Peter Amstutz wrote in #note-19:

https://dev.arvados.org/projects/arvados/wiki/Release_Checklist

What seems most straightforward to me is for everyone to be on the same page that step #3, "Ensure that the release staging branch passes automated tests on jenkins." includes the new Java+R SDK test. I'm happy to add that, but it seems like it would be even better to list all of the Jenkins jobs that are expected to pass on the release branch on this page. What are they?

Well, it is slightly vague just to account for changes in Jenkins configuration without making the process go out of date -- and because for new releases this generally means the current "developer-run-tests" but for point releases it means the "2.X-developer-run-tests" because we archive the old Jenkins jobs to make sure they're still runnable.

However I think explicitly calling out the R SDK and Java SDK would make sense since they have been split off from the current "developer-run-tests".

If we make that change, is that sufficient by itself? There would be no immediate need to update the TASKS file if we're just expanding the definition of that task.

Sounds good.

The wiki could use a modernization pass in general. e.g., it still refers to the separate workbench2 repository. I'd be fine doing that with fresh eyes too if there's no objection.

Please do.

Actions #22

Updated by Brett Smith 10 months ago

Well, it is slightly vague just to account for changes in Jenkins configuration without making the process go out of date

There's no value to being vague like this when like half the rows in the table already link to other specific Jenkins jobs. And yeah, half those links were broken. Documentation needs to be kept up-to-date.

and because for new releases this generally means the current "developer-run-tests" but for point releases it means the "2.X-developer-run-tests" because we archive the old Jenkins jobs to make sure they're still runnable.

Documented this. Again, there are already other tasks with similar splits.

Please do.

https://dev.arvados.org/projects/arvados/wiki/Release_Checklist/diff?utf8=%E2%9C%93&commit=View+differences&version=70&version_from=69

Actions #23

Updated by Peter Amstutz 10 months ago

  • Target version changed from Development 2024-03-13 sprint to Development 2024-03-27 sprint
Actions #24

Updated by Peter Amstutz 10 months ago

Brett Smith wrote in #note-22:

Well, it is slightly vague just to account for changes in Jenkins configuration without making the process go out of date

There's no value to being vague like this when like half the rows in the table already link to other specific Jenkins jobs. And yeah, half those links were broken. Documentation needs to be kept up-to-date.

and because for new releases this generally means the current "developer-run-tests" but for point releases it means the "2.X-developer-run-tests" because we archive the old Jenkins jobs to make sure they're still runnable.

Documented this. Again, there are already other tasks with similar splits.

Please do.

https://dev.arvados.org/projects/arvados/wiki/Release_Checklist/diff?utf8=%E2%9C%93&commit=View+differences&version=70&version_from=69

LGTM.

The only typo I noticed is that you deleted the footnote at the bottom of the page without deleting the reference to the footnote (1) in the steps.

Actions #25

Updated by Peter Amstutz 10 months ago

  • Target version changed from Development 2024-03-27 sprint to Development 2024-04-10 sprint
Actions #26

Updated by Peter Amstutz 10 months ago

  • Target version changed from Development 2024-04-10 sprint to Development 2024-03-27 sprint
Actions #27

Updated by Peter Amstutz 10 months ago

  • Status changed from In Progress to Resolved
Actions #28

Updated by Peter Amstutz 8 months ago

  • Release set to 70