Feature #16491

open

Local/Samba/NFS Arvados uploads in pure Golang

Added by Stanislaw Adaszewski almost 4 years ago. Updated 27 days ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version:
Story points: -
Release:
Release relationship: Auto

Description

A frequent use case we are facing is uploading data from hosts:

1) without a Python installation,
2) where we have no root access,
3) which might have IP-based access to NFS or Samba resources but do not allow regular users to mount those resources.

Having a single standalone binary that could be transferred to such a host and used for Arvados uploads is desired because:
- it is time-consuming and bothersome to install Python, arvados-api-client and arvados certificates and/or modify httplib2 not to verify the certificates every time we need to upload something from a new host

The upload directly from those hosts is desired because:
- when working remotely and using e.g. Cyberduck to transfer data from a Samba share to Arvados' WebDAV, the entire dataset, which can range anywhere from GBs to TBs, has to go via our private Internet connection
- UDP tunneling through SSH is complicated but (sometimes) needed for NFS/Samba

Pure Golang building blocks for accessing NFS and Samba seem to be there:
https://github.com/vmware/go-nfs-client
https://godoc.org/github.com/hirochachacha/go-smb2

There is also the Go SDK for Arvados.

It should be a matter of pulling those together to offer a single standalone binary alternative to arv-put that would support:

- upload of local files/directories
- upload of Samba files/directories
- upload of NFS files/directories
- creating (in the specified project) or updating existing collections

I think such a tool would be great. Again - just a private opinion.


Related issues

Related to Arvados Epics - Idea #16082: Port client tools to Go (New)

#1

Updated by Peter Amstutz almost 4 years ago

  • Related to Idea #16082: Port client tools to Go added

#2

Updated by Peter Amstutz almost 4 years ago

Hi Stanislaw,

We have a plan to replace the Python utilities (especially arv-get/arv-put) with Go versions to simplify dependencies; this is part of epic #16082. That will help with uploading data to Arvados from the client.

For ingesting data by downloading from a server, we recommend writing a small CWL CommandLineTool or Workflow which has NetworkAccess enabled and fetches data using curl or smbclient or some other protocol client.

#3

Updated by Peter Amstutz about 1 year ago

  • Release set to 60

#4

Updated by Peter Amstutz 27 days ago

  • Target version set to Future