Over the past year the AutDB scoring advisory panel has looked carefully at the literature on the genetics of autism in an effort to come up with criteria that could be used to assess the strength of the evidence for each gene. As has been mentioned elsewhere, this was a difficult task, given the wide-ranging nature of the genetic evidence. The complexity of the data, encompassing both rare and common variant approaches, means that there are of course many ways in which one might do this. Our goal was to come up with some sensible rules to assess the relative value of sample size, statistical significance, replication, functional evidence, and other kinds of data that exist for each gene.
We set out with three guiding principles. First, our emphasis would be on evidence from human genetics. While functional studies will eventually be of great importance, and knockout mice of great value, we considered the gold standard for relevance to autism to be the study of genotypes in human cohorts. Second, we started with no assumptions about individual genes, and indeed found that the evidence for many 'high profile' genes is surprisingly weak. Finally, we emphasized the importance of community feedback as a way to improve the scoring criteria and the scores themselves.
Although the annotation criteria may seem complex at first glance, they can be usefully summarized as follows:
Syndromic
We recognized that genes predisposing to autism in the context of a syndromic disorder (e.g. fragile X syndrome) should be placed in a separate category (S). Any such gene that also has evidence implicating it in idiopathic autism will have a number in front of the S indicating the strength of that evidence (e.g. 3S, which would be listed both in the S category and in category 3).
Categories 1 (high confidence) and 2 (strong candidate)
We considered a rigorous statistical comparison between cases and controls, yielding genome-wide statistical significance, with independent replication, to be the strongest possible evidence for a gene. These criteria were relaxed slightly for category 2.
Categories 3 (suggestive evidence) and 4 (minimal evidence)
The literature is replete with relatively small studies of candidate genes, using either common or rare variant approaches, that do not meet the criteria set out for categories 1 and 2. Genes that had two such lines of evidence supporting them were placed in category 3, and those with one line of evidence were placed in category 4. Some additional lines of 'accessory evidence' (indicated as 'acc' in the score cards) could also boost a gene from category 4 to 3.
Categories 5 (hypothesized but untested) and 6 (evidence does not support a role)
The list of genes in AutDB is inclusive, and as such there are genes that have been implicated solely by evidence in model organisms, or other evidence of a marginal nature. These genes were placed in category 5, as they have not yet been rigorously tested in a human cohort. Category 6 is for those genes that have been so tested, but where the weight of the evidence argues against a role in autism.
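For readers who find rules easier to follow as code, the category 3 and 4 logic and the syndromic labelling described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the panel's actual scoring procedure: the names `GeneEvidence`, `suggestive_or_minimal`, and `score_label` are hypothetical, "two lines of evidence" is read here as "two or more", and a single flag stands in for whatever accessory evidence qualifies for the boost. Categories 1, 2, 5, and 6 are left out because their criteria are not spelled out in enough detail in this summary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneEvidence:
    """Hypothetical summary of the evidence for one gene."""
    primary_lines: int           # small-study lines of evidence (common or rare variant)
    has_accessory: bool = False  # assumed flag: qualifying 'acc' evidence is present


def suggestive_or_minimal(ev: GeneEvidence) -> Optional[int]:
    """Return 3 or 4 per the summary above, or None if neither applies."""
    if ev.primary_lines >= 2:    # assumption: 'two lines' means two or more
        return 3
    if ev.primary_lines == 1:
        # Accessory evidence can boost a gene from category 4 to 3.
        return 3 if ev.has_accessory else 4
    return None


def score_label(category: Optional[int], syndromic: bool) -> str:
    """Build a label such as '3', 'S', or '3S'."""
    if category is None and not syndromic:
        raise ValueError("a gene needs a category, a syndromic flag, or both")
    number = "" if category is None else str(category)
    return number + ("S" if syndromic else "")


if __name__ == "__main__":
    gene = GeneEvidence(primary_lines=1, has_accessory=True)
    print(score_label(suggestive_or_minimal(gene), syndromic=True))
```

In this sketch a syndromic gene with one line of evidence plus accessory support would be labelled 3S, matching the example given in the Syndromic section.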
Users will note that very few genes, in our estimation, fall into categories 1 and 2. For both common and rare variants, with a small number of exceptions, sample sizes are as yet too small to reliably detect true associations. Larger collections, emerging exome sequence data, and relatively inexpensive genotyping platforms should combine over the next couple of years to change this.
Finally, it is important to note that there is much still to be done. In the coming months additional genes will be scored, as will multi-genic copy number variants. In the meantime we look forward to feedback and the opportunity to improve this resource for the autism research community.