Feature #490
open
Classify web-hits as relevant/not-relevant
Added by Madeleine Ball over 14 years ago.
Updated about 13 years ago.
Description
Each time a variant is seen in a new genome, it should be queued for a web search. The search should use the chromosome position, the rsID, and, when available, the gene/amino acid change (in both one-letter and three-letter abbreviations).
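As a rough illustration of the search terms described above, here is a minimal Python sketch. The function names (`long_form`, `search_terms`), the `chrom:pos` term format, and the example gene and rsID are all hypothetical and are not taken from the GET-Evidence code; only the one-letter to three-letter amino acid mapping is standard. It handles simple substitutions such as R123Q and skips anything else.

```python
import re

# Standard one-letter to three-letter amino acid abbreviations.
AA_3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def long_form(aa_change):
    """Convert a simple substitution like 'R123Q' to 'Arg123Gln'.

    Returns None when the input is not a simple one-letter substitution.
    """
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", aa_change)
    if not m or m.group(1) not in AA_3 or m.group(3) not in AA_3:
        return None
    return f"{AA_3[m.group(1)]}{m.group(2)}{AA_3[m.group(3)]}"


def search_terms(chrom, pos, rsid=None, gene=None, aa_change=None):
    """Build the list of web-search terms for one variant (hypothetical format)."""
    terms = [f"{chrom}:{pos}"]          # chromosome position is always searched
    if rsid:
        terms.append(rsid)              # rsID, where applicable
    if gene and aa_change:
        terms.append(f"{gene} {aa_change}")   # one-letter form
        long = long_form(aa_change)
        if long:
            terms.append(f"{gene} {long}")    # three-letter form
    return terms


# Made-up example values, for illustration only.
print(search_terms("chr1", 12345, rsid="rs0000", gene="GENE1", aa_change="R123Q"))
```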
Logged-in users should be able to classify web-search results as relevant or not relevant. Each classification increments a counter for "relevant" or "not relevant". Users can change their minds, but each user gets only one vote per result.
Implementation notes:
- Add columns to flat_summary table for autoscore, web hits, genome hits, webscore... and refresh it
- Relax web search criteria: include all single-genome-hit variants
- Add long form AA (and rsid where applicable) to search terms in web search
- Requeue old web searches (to pick up rsid and long form AA results)
- Web hit vote history: {variant, url, oid, timestamp, score}
- Web hit current vote: {variant, url, score}
- UI: "vote yes" and "vote no" buttons (immediate ajax call)
- During vote event, update flat summary if webscore has changed as a result
- Decision for each web hit: a tie counts as relevant; otherwise the majority wins
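The voting notes above could be sketched in Python roughly as follows. This is an in-memory illustration only, not the actual GET-Evidence schema: the class name `WebHitVotes` and its methods are hypothetical, and the sketch assumes that the current vote is tracked per user (keyed by `oid`) so that a changed vote replaces the earlier one instead of counting twice.

```python
import time


class WebHitVotes:
    """Hypothetical in-memory store for web-hit votes."""

    def __init__(self):
        # Append-only history: (variant, url, oid, timestamp, score)
        self.history = []
        # Current vote per user and hit: (variant, url, oid) -> score (1 or 0)
        self.current = {}

    def vote(self, variant, url, oid, relevant, timestamp=None):
        """Record a vote; a repeat vote by the same user replaces the old one."""
        score = 1 if relevant else 0
        when = time.time() if timestamp is None else timestamp
        self.history.append((variant, url, oid, when, score))
        self.current[(variant, url, oid)] = score

    def tally(self, variant, url):
        """Return (relevant_count, not_relevant_count) for one web hit."""
        scores = [s for (v, u, _), s in self.current.items()
                  if v == variant and u == url]
        yes = sum(scores)
        return yes, len(scores) - yes

    def decision(self, variant, url):
        """True = relevant, False = not relevant, None = no votes yet."""
        yes, no = self.tally(variant, url)
        if yes + no == 0:
            return None
        return yes >= no  # a tie counts as relevant


votes = WebHitVotes()
votes.vote("V1", "http://example.org/a", "alice", True)
votes.vote("V1", "http://example.org/a", "bob", False)
print(votes.decision("V1", "http://example.org/a"))  # tie, so relevant
```

Keeping the history separate from the current votes means a user can change their mind without losing the audit trail, while the tally still counts each user once.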
Questions:
- Is webscore=0 (not relevant) suitable for variants with no web search results?
Sub-tasks:
- This should be accompanied by a report which shows the number of variants seen in only one genome and sorted by autoscore (a column in the report).
- The dump should have a column for "relevance" which can have the following values:
  - Relevant (at least 1 user identified a relevant web-hit)
  - Not-Relevant (at least 1 user identified every web-hit as not relevant)
  - Not-Reviewed
The report should be a list of variants only in one genome, sorted according to autoscore and randomized within each autoscore.
I.e., in order:
[randomized list of variants in only one genome with autoscore = 6]
followed by
[randomized list of variants in only one genome with autoscore = 5]
...
etc.
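One way to get this ordering is to shuffle the list first and then apply a stable sort on autoscore, so ties keep their shuffled order. The sketch below is only an illustration; the function name `report_order` and the dictionary keys `genome_hits` and `autoscore` are assumed names, not the real column names.

```python
import random


def report_order(variants, rng=None):
    """Variants seen in only one genome, highest autoscore first,
    in random order within each autoscore."""
    rng = rng or random.Random()
    # Keep only variants seen in exactly one genome.
    single = [v for v in variants if v["genome_hits"] == 1]
    # Shuffle first; Python's sort is stable (also with reverse=True),
    # so variants with equal autoscore stay in their shuffled order.
    rng.shuffle(single)
    single.sort(key=lambda v: v["autoscore"], reverse=True)
    return single


# Made-up example rows, for illustration only.
rows = [
    {"variant": "V1", "autoscore": 5, "genome_hits": 1},
    {"variant": "V2", "autoscore": 6, "genome_hits": 1},
    {"variant": "V3", "autoscore": 6, "genome_hits": 3},
    {"variant": "V4", "autoscore": 5, "genome_hits": 1},
]
for row in report_order(rows):
    print(row["autoscore"], row["variant"])
```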
It would be nice if we also had a column summarizing the evaluated web hits so far (i.e., someone has evaluated this as having N valid hits and M nonvalid hits). It should not report the raw number of web hits. (I don't want this to bias our behavior in going through these evaluations for the purpose of the paper; someday we may wish to add it, of course.)
The voting system is in place, so you can vote relevant / not relevant for each web hit.
The current decision (if any) for each web hit is shown on the left side of the link.
If you're logged in, you can vote by clicking the "thumbs up" or "thumbs down" icons on the right side of the link.
UI fixes todo:
- Indicate what your current vote is for each link (perhaps by highlighting the icon that represents your existing vote)
- Improve icons (suggestions other than fixing the anti-alias/resize ugliness?)
- Tool-tips to indicate what the voting icons do
First stab at a report: http://evidence.personalgenomes.org/report?type=need-web-review
Also, the latest-flat dump has a "webscore" column:
- "Y" if at least one web hit has been voted "relevant" by a majority or a tie; or
- "-" if some web hits have not been voted on yet; or
- "N" otherwise (i.e. all web hits are voted "not relevant", or there are no web hits)
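The three "webscore" values described above can be expressed as a small function. This is a sketch of the stated rule, not the actual dump code; the function names and the `(yes, no)` tally representation are assumptions made for illustration.

```python
def hit_decision(yes, no):
    """True = relevant (majority or tie), False = not relevant, None = no votes."""
    if yes + no == 0:
        return None
    return yes >= no


def webscore(hits):
    """Compute the webscore for one variant.

    `hits` is a list of (yes, no) vote tallies, one pair per web hit.
    """
    decisions = [hit_decision(yes, no) for yes, no in hits]
    if any(d is True for d in decisions):
        return "Y"   # at least one hit voted relevant by majority or tie
    if any(d is None for d in decisions):
        return "-"   # some hits have not been voted on yet
    return "N"       # all hits voted not relevant, or there are no hits


print(webscore([(1, 1), (0, 0)]))  # Y
print(webscore([(0, 1), (0, 0)]))  # -
print(webscore([]))                # N
```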
This is great! Can you increase the number of pages displayed at:
- Project changed from 19 to GET-Evidence
- Category deleted (GET-Evidence)
- Status changed from New to In Progress
- Priority changed from Urgent to Normal
- % Done changed from 0 to 80