Feature #516: Add conservation track from UCSC

BLOSUM100 is a simple measure of conservation, but it is better suited to measuring how disruptive a particular amino acid change will be. We should also have the phyloP score, downloaded from UCSC, for the base under consideration, where available.

The phyloP score is a -log(p) measure of predicted conservation for a specific base (as opposed to phastCons, which measures conservation of the base in the context of its flanking bases). Note that phyloP scores can also be negative, which indicates that the base in question is predicted to be fast-evolving.

I've downloaded the broadest track, vertebrates (there are also mammal and primate tracks), and written a very simple program (lookup.c) to pull out values given chr# and location. Interestingly, the lookup values agree qualitatively with the UCSC genome browser values, but they are not an exact match. Perhaps the phyloP switches used for the UCSC FTP files were slightly different from the switches used for the data in the genome browser. It might be worth re-running some of the data.
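
For reference, here is a minimal sketch of what such a lookup might look like. It is not the actual lookup.c. It assumes the downloaded files are in the fixedStep wiggle format (a "fixedStep chrom=... start=... step=..." header followed by one score per line) and contain only fixedStep blocks. The function name phylop_lookup and its signature are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the real lookup.c): scan a fixedStep wiggle
 * stream for the score at (chrom, pos). pos is 1-based, matching the
 * "start=" field of the wiggle header.
 * Returns 1 and stores the score in *score if found, 0 otherwise. */
int phylop_lookup(FILE *in, const char *chrom, long pos, double *score)
{
    char line[256];
    char cur_chrom[64] = "";
    long cur_pos = 0;   /* coordinate of the next data line */
    long step = 1;

    while (fgets(line, sizeof line, in) != NULL) {
        if (strncmp(line, "fixedStep", 9) == 0) {
            /* Header line: begin a new block of scores. */
            long start = 0;
            step = 1;
            cur_chrom[0] = '\0';
            if (sscanf(line, "fixedStep chrom=%63s start=%ld step=%ld",
                       cur_chrom, &start, &step) < 2) {
                cur_chrom[0] = '\0';   /* malformed header: ignore block */
            }
            if (step < 1) {
                step = 1;
            }
            cur_pos = start;
            continue;
        }

        /* Data line: one score per line. Skip anything non-numeric. */
        char *end;
        double value = strtod(line, &end);
        if (end == line) {
            continue;
        }

        if (cur_pos == pos && strcmp(cur_chrom, chrom) == 0) {
            *score = value;
            return 1;
        }
        cur_pos += step;
    }
    return 0;   /* position not covered by any block */
}
```

A caller would open the (uncompressed) score file with fopen, call phylop_lookup(fp, "chr1", 12345, &score), and treat a return value of 0 as "no score available". Note that the sketch takes 1-based positions, as wiggle files do; if positions are read from a 0-based source, an off-by-one shift is an easy way to end up with values that look similar to the browser's but do not match exactly.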

All can be found on:


Closing this ticket should include folding these data into the GET INSTALL scripts.

[Pollard, Hubisz, Rosenbloom, et al., Genome Res, 2009]


