Idea #18103

Updated by Peter Amstutz over 2 years ago

A script, runnable from a cron job or a workflow, that scans an S3 bucket, finds data, and determines the batch, sample id, and type. It then copies the data to Arvados, sets the batch id, and updates the status of the sample to "sequenced".

 The path is (bucket) batch / sample id / sample files:

 (data-release) MM-002DNA/ MM_0026_DNA_T_04_01/ MM_0026_DNA_T_04_01_L001_R1_001.fastq.gz 
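
 As a rough illustration of the path convention above, the sketch below splits an object key into batch, sample id, and file name. The names `SampleFile` and `parse_key` are hypothetical, and the sketch assumes the key carries exactly three slash-separated parts with the bucket name already stripped. How the sample type is derived is not stated here, so it is left out.

```python
from typing import NamedTuple, Optional


class SampleFile(NamedTuple):
    """One object key split into its three path parts (hypothetical type)."""
    batch: str
    sample_id: str
    filename: str


def parse_key(key: str) -> Optional[SampleFile]:
    """Split 'batch/sample id/sample file'; return None if the key does not fit.

    Assumption: keys have exactly three non-empty parts. Whitespace around
    each part is trimmed, since the example path is shown with spaces
    after the slashes.
    """
    parts = [p.strip() for p in key.split("/")]
    if len(parts) != 3 or not all(parts):
        return None
    return SampleFile(*parts)
```

 Keys that do not fit the layout (for example a bare batch prefix) come back as `None`, so a scanning loop can skip them rather than guess.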

 Match the sample id to an existing sample id and upload the files to Arvados. Also find or create the batch and associate the sample id with it.
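
 The matching step could be planned along these lines before anything is uploaded. This is only a sketch: `plan_import` is a hypothetical helper, the known sample ids and batches are passed in as plain sets (where they really come from is not specified here), and the actual S3 listing and Arvados upload are deliberately left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def plan_import(
    files: Iterable[Tuple[str, str, str]],
    known_sample_ids: Set[str],
    known_batches: Set[str],
) -> Dict[str, object]:
    """Decide what to upload and which batches to create.

    `files` holds (batch, sample_id, filename) tuples taken from the
    bucket paths. Nothing is uploaded here; the result is only a plan.
    """
    uploads: Dict[str, List[str]] = defaultdict(list)
    batch_of: Dict[str, str] = {}
    unmatched: Set[str] = set()

    for batch, sample_id, filename in files:
        if sample_id not in known_sample_ids:
            # No existing sample to attach the data to; report it instead.
            unmatched.add(sample_id)
            continue
        uploads[sample_id].append(filename)
        batch_of[sample_id] = batch

    # Batches referenced by matched samples that do not exist yet.
    new_batches = sorted(set(batch_of.values()) - known_batches)

    return {
        "uploads": dict(uploads),        # sample id -> files to copy
        "batch_of": batch_of,            # sample id -> batch to associate
        "new_batches": new_batches,      # batches to create first
        "unmatched": sorted(unmatched),  # sample ids with no existing match
    }
```

 Separating the plan from the upload keeps the matching logic easy to check on its own, and each sample in `uploads` would be the one marked "sequenced" once its files are copied.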
