Idea #5914

open

[DRAFT] Provide one clear way for users to get data from an external source into Arvados

Added by Brett Smith almost 9 years ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: Documentation
Target version:
Start date:
Due date:
Story points: -
Release:
Release relationship: Auto

Description

Background

Users frequently want to work with data that is available from public sites. It should be easy for them to get that data into Arvados so they can start processing it as quickly as possible.

Proposed solution

Write one page of documentation: a step-by-step walkthrough of the recommended way to do this. It should cover:

  • How to get the data with wget, including all the flags needed to download reliably over an unreliable link while keeping as much metadata as possible (such as Last-Modified dates)
  • How to get the data into Arvados, probably with the writable FUSE mount
  • How to get updates. This can be done with wget if all the right metadata is available; otherwise… this is where the story gets a little fuzzy.
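As a starting point for the walkthrough, the first two steps could be sketched roughly as follows. This is only an illustration, not a tested recommendation. The wget options used (--tries, --waitretry, --timeout, --retry-connrefused, --timestamping, --directory-prefix) are standard wget flags, but the specific values are guesses. The function names, and the idea that `collection_dir` is a directory inside a writable Arvados FUSE mount, are assumptions made for the example.

```python
import shutil
import subprocess
from pathlib import Path


def build_wget_command(url, download_dir, tries=10):
    """Build a wget argv aimed at reliable downloads over a flaky link.

    The retry and timeout values are illustrative guesses, not tuned numbers.
    """
    return [
        "wget",
        f"--tries={tries}",        # retry a failed transfer up to this many times
        "--waitretry=30",          # back off up to 30 seconds between retries
        "--timeout=60",            # give up on a stalled connection after 60 seconds
        "--retry-connrefused",     # treat "connection refused" as a transient error
        "--timestamping",          # skip files that are not newer than the local copy
        f"--directory-prefix={download_dir}",
        url,
    ]


def fetch(url, download_dir):
    """Run wget; raises CalledProcessError if wget exits non-zero."""
    subprocess.run(build_wget_command(url, download_dir), check=True)


def copy_into_collection(download_dir, collection_dir):
    """Copy downloaded files into collection_dir, keeping the directory layout.

    collection_dir is ASSUMED to be a directory inside a writable Arvados
    FUSE mount. Only file contents are copied; the timestamps wget set stay
    on the local download directory.
    """
    copied = []
    for src in sorted(Path(download_dir).rglob("*")):
        if src.is_file():
            dest = Path(collection_dir) / src.relative_to(download_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(src, dest)
            copied.append(dest)
    return copied
```

Because --timestamping compares the server's Last-Modified date against the local file, rerunning `fetch` against the same download directory is one possible way to pick up updates when that metadata is available. The case where it is not available remains open, as noted above.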