Idea #13756

Convert 10 Harvard PGP GFF to CGF [CWL]

Added by Abram Connelly almost 6 years ago. Updated over 5 years ago.

Status:
Closed
Priority:
Normal
Assigned To:
Target version:
-
Start date:
08/06/2018
Due date:
Story points:
-

Description

Most of the components should be in place to do a full conversion from GFF to CGF as a Common Workflow Language (CWL) pipeline. As an initial pass, to confirm that each component works properly and to fix any issues that come up, convert 10 Harvard PGP GFF files to CGF.

  • 10 Harvard GFF files have been taken from the website (build 37) and put into collection 6318c4d7099cb65b4577c5d0595dd412+2589
  • They have been pre-processed: compressed with bgzip and indexed with tabix (the script that did the recompression and indexing is in the .scripts directory in the collection). This should eventually be a CWL pipeline, but in the interest of expediency it was done as a pre-processing step so that work can focus on the conversion workflow.
  • The 10 datasets will be used to create a stand-alone SGLF tile library, which will be used for the final conversion.
  • The workflow should consist of conversion with checks after each major step:
    - Convert GFF to FastJ
    - Check FastJ
    - Collect FastJ into SGLF
    - Check SGLF
    - Convert FastJ and SGLF to band and CGF
    - Check CGF against the original GFF
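
The step-then-check ordering above can be sketched roughly as follows. This is only an illustration of the control flow, not the CWL workflow itself: `run_pipeline`, `noop`, `always_ok` and `STAGES` are hypothetical names, and the placeholder callables stand in for the real conversion and checking tools. Each stage receives the whole batch of datasets, since the SGLF library is built from all of them.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures: a step transforms the batch, a check validates it.
Step = Callable[[List[str]], None]
Check = Callable[[List[str]], bool]


def run_pipeline(datasets: List[str],
                 stages: List[Tuple[str, Step, Check]]) -> List[str]:
    """Run each stage in order and stop at the first failed check.

    Returns the names of the stages that ran and passed their check.
    """
    completed = []
    for name, step, check in stages:
        step(datasets)
        if not check(datasets):
            raise RuntimeError(f"check failed after stage: {name}")
        completed.append(name)
    return completed


def noop(datasets: List[str]) -> None:
    """Placeholder for a real conversion step (assumption)."""


def always_ok(datasets: List[str]) -> bool:
    """Placeholder for a real check (assumption)."""
    return True


# Stage names follow the list above; each is paired with its check.
STAGES = [
    ("Convert GFF to FastJ", noop, always_ok),      # followed by: Check FastJ
    ("Collect FastJ into SGLF", noop, always_ok),   # followed by: Check SGLF
    ("Convert FastJ and SGLF to band and CGF", noop, always_ok),  # followed by: Check CGF against GFF
]
```

Stopping at the first failed check keeps a bad intermediate file from feeding into the later, more expensive stages.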

This will be a whole genome conversion for each dataset.

This ticket will be considered complete when the 10 datasets have been converted successfully (with tests passing) and timing results have been reported.
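
One way to collect the timing results and evaluate the completion condition is sketched below. It assumes a hypothetical `convert` callable that runs the conversion for one dataset and returns whether its tests passed; `time_conversions` and `is_complete` are illustrative names, not part of any existing tool.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def time_conversions(datasets: Iterable[str],
                     convert: Callable[[str], bool]) -> Dict[str, Tuple[bool, float]]:
    """Run convert() per dataset and record (passed, elapsed seconds)."""
    results = {}
    for name in datasets:
        start = time.perf_counter()
        passed = convert(name)
        results[name] = (passed, time.perf_counter() - start)
    return results


def is_complete(results: Dict[str, Tuple[bool, float]], expected: int = 10) -> bool:
    """True when all expected datasets were converted and every one passed."""
    return len(results) == expected and all(ok for ok, _ in results.values())
```

Wall-clock time is measured with `time.perf_counter()`; a real run would more likely take its timings from the workflow runner's own logs.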


Subtasks 1 (0 open, 1 closed)

Task #13915: Review #13756, branch 13756-test-10-harvard-pgp-gff-to-cgf (Closed, Sarah Zaranek, 08/06/2018)