Cohorts
In many studies involving large numbers of autistic and control individuals, case cohorts are drawn from individuals enrolled in large repositories such as the Autism Genetic Resource Exchange (AGRE), the Autism Genome Project (AGP), the International Molecular Genetic Study of Autism Consortium (IMGSAC), or the Simons Simplex Collection (SSC).
When available, the following information on both case and control cohorts is extracted and presented in the CNV module as population data:
Description. A brief synopsis of the cohort, including the source of the individuals within the cohort.
Cohort size. Case and control cohorts come in a wide range of sizes (see graphs below). Smaller case cohorts frequently provide more information on the phenotypic characteristics of affected individuals, but carry less statistical weight in determining the pathogenic relevance of a CNV at a given locus across populations. Larger case cohorts, on the other hand, carry more statistical weight in determining pathogenic CNV relevance, but typically provide far less information on the phenotypic characteristics of affected individuals.
Diagnosis. The diagnostic criteria (ADI-R, ADOS, etc.) are often described, as is the number of individuals with specific primary diagnoses, such as autism, Asperger's, or PDD-NOS.
Age. Typically given as either a range of ages or a mean age.
Gender. Males are diagnosed with ASD approximately three times as often as females. Large autistic case cohorts are typically designed to reflect this disparity, with roughly 70-85% of individuals within a case cohort being male. Control cohorts, on the other hand, are typically 50% male.
Geographical Ancestry. The majority of cohorts are predominantly of Caucasian/European origin. As such, determining the pathogenic relevance of a CNV at a given locus across ethnic groups is difficult.
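As an illustration only, the population data fields listed above could be held in a record where every field is optional, reflecting that each item is presented only "when available". This is a minimal sketch: the class name CohortRecord, its field names, and the sample values are hypothetical and are not taken from the CNV module itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CohortRecord:
    """Hypothetical container for the population data of one cohort."""

    description: Optional[str] = None   # brief synopsis, including source
    size: Optional[int] = None          # number of individuals
    diagnosis: Optional[str] = None     # criteria and primary diagnoses
    age: Optional[str] = None           # a range of ages or a mean age
    percent_male: Optional[float] = None
    ancestry: Optional[str] = None      # geographical ancestry

    def available_fields(self) -> List[str]:
        """Return the names of the fields that were actually reported."""
        return [name for name, value in vars(self).items() if value is not None]


# Made-up values, purely to show how missing fields are handled.
record = CohortRecord(size=100, percent_male=80.0)
print(record.available_fields())
```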
Each cohort in the CNV module dataset is assigned a name (or cohort ID) that consists of the following components: the first author and year of publication of the report in which the cohort is described, the disease being investigated, whether the cohort is a discovery cohort or a replication cohort, and whether the cohort consists of cases (i.e., individuals diagnosed with the disease of interest) or controls. While all reports in the database feature a discovery case cohort, only a few also describe a replication case cohort, in which the authors attempt to replicate the findings from the discovery cohort in a new population of cases.
For example, for the ASD discovery case cohort described in Pinto D 2010, the name of the cohort in the module would be:
pinto_10_ASD_discovery_cases
On the other hand, for the ASD replication control cohort described in Glessner JT 2009, the name of the cohort would be:
glessner_09_ASD_replication_controls
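The naming pattern seen in the two examples above can be sketched in code. The sketch below is inferred from those examples alone and rests on several assumptions: the surname is lowercased, the year is shortened to two digits, the components are joined with underscores, and the surname itself contains no underscore. The function and class names are hypothetical and are not part of the CNV module.

```python
import re
from typing import NamedTuple


class CohortId(NamedTuple):
    author: str   # lowercased first-author surname
    year: str     # two-digit publication year, as written in the name
    disease: str  # e.g. "ASD"
    stage: str    # "discovery" or "replication"
    group: str    # "cases" or "controls"


# Assumed layout: author_yy_disease_stage_group
_COHORT_ID = re.compile(
    r"^(?P<author>[a-z][a-z\-]*)_(?P<year>\d{2})_(?P<disease>[A-Za-z0-9\-]+)"
    r"_(?P<stage>discovery|replication)_(?P<group>cases|controls)$"
)


def build_cohort_id(surname: str, pub_year: int, disease: str,
                    stage: str, group: str) -> str:
    """Assemble a cohort name from its components."""
    if stage not in ("discovery", "replication"):
        raise ValueError(f"unknown stage: {stage!r}")
    if group not in ("cases", "controls"):
        raise ValueError(f"unknown group: {group!r}")
    return f"{surname.lower()}_{pub_year % 100:02d}_{disease}_{stage}_{group}"


def parse_cohort_id(cohort_id: str) -> CohortId:
    """Split a cohort name back into its components."""
    match = _COHORT_ID.match(cohort_id)
    if match is None:
        raise ValueError(f"not a recognised cohort name: {cohort_id!r}")
    return CohortId(**match.groupdict())


print(build_cohort_id("Pinto", 2010, "ASD", "discovery", "cases"))
print(parse_cohort_id("glessner_09_ASD_replication_controls"))
```

Because the year is stored as two digits, parsing returns it as the string found in the name rather than guessing the century.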