From NeuroLex

Category:Resource:1000 Genomes: A Deep Catalog of Human Genetic Variation
Abbrev 1000 Genomes Project  +
CurationStatus curated  +
DefiningCitation  +
Definition Database of genomic sequence data spanning several human populations, including many families, intended to provide a comprehensive resource on human genetic variation that will be made available quickly to the worldwide scientific community through freely accessible public databases. Redundant sequencing of the same samples on various platforms and by different groups of scientists can be compared. The goal of the Project is to find most genetic variants that have frequencies of at least 1% in the populations studied. This goal can be attained by sequencing many individuals lightly. To sequence a person's genome, many copies of the DNA are broken into short pieces and each piece is sequenced. Because many copies of the DNA are used, the pieces are more or less randomly distributed across the genome. The pieces are then aligned to the reference sequence and joined together. Finding the complete genomic sequence of one person with current sequencing platforms requires sequencing that person's DNA the equivalent of about 28 times (called 28X). If the amount of sequence generated averages only once across the genome (1X), much of the sequence will be missed, because some genomic locations will be covered by several pieces while others will be covered by none. The deeper the sequencing coverage, the more of the genome will be covered at least once. Also, people are diploid; the deeper the sequencing coverage, the more likely it is that both chromosomes at a location will be included. In addition, deeper coverage is particularly useful for detecting structural variants, and it allows sequencing errors to be corrected. Sequencing is still too expensive to deeply sequence the many samples being studied for this project. However, any particular region of the genome generally contains a limited number of haplotypes, so data can be combined across many samples to allow efficient detection of most of the variants in a region.
The Project currently plans to sequence each sample to about 4X coverage; at this depth, sequencing cannot provide the complete genotype of each sample, but it should allow the detection of most variants with frequencies as low as 1%. Combining the data from 2500 samples should allow highly accurate estimation (imputation) of the variants and genotypes for each sample that were not seen directly by the light sequencing. The sequence and alignment data generated by the 1000 Genomes Project are made available as quickly as possible via its mirrored FTP sites.
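The coverage argument in the Definition above can be made concrete with a small calculation. The sketch below is not part of the Project's documentation: it assumes that reads land uniformly at random, so the number of reads covering a site follows a Poisson distribution with mean equal to the coverage depth, and that reads are drawn equally from the two chromosome copies. The function names are hypothetical and chosen for illustration only.

```python
import math


def fraction_covered(depth):
    """Expected fraction of sites covered by at least one read.

    Assumption: read counts per site are Poisson with mean `depth`.
    """
    return 1.0 - math.exp(-depth)


def both_haplotypes_seen(depth):
    """Probability that both chromosome copies at a diploid site
    are each sampled at least once.

    Assumption: each copy independently receives Poisson(depth / 2) reads.
    """
    per_copy = depth / 2.0
    return (1.0 - math.exp(-per_copy)) ** 2


def expected_variant_reads(n_samples, depth, allele_freq):
    """Expected number of reads, pooled over all samples, that carry
    a variant allele of the given population frequency.

    Assumption: every chromosome copy is sequenced to depth / 2 on average.
    """
    chromosomes = 2 * n_samples
    return chromosomes * allele_freq * (depth / 2.0)


if __name__ == "__main__":
    for depth in (1, 4, 28):
        print(
            f"{depth}X: covered at least once = {fraction_covered(depth):.4f}, "
            f"both chromosomes seen = {both_haplotypes_seen(depth):.4f}"
        )
    pooled = expected_variant_reads(n_samples=2500, depth=4, allele_freq=0.01)
    print(f"Expected pooled reads for a 1% variant: {pooled:.0f}")
```

Under these assumptions, roughly 63% of sites are covered at least once at 1X and about 98% at 4X, yet both chromosome copies are seen at only about 75% of sites at 4X, which is why a single lightly sequenced sample cannot be fully genotyped. Pooled over 2500 samples, a variant at 1% frequency is expected to appear in about 100 reads, which illustrates why combining data across samples makes detection feasible.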
ExampleImage 1000 Genomes Project.PNG +
Has default form Resource  +
Has role Database +
Id nlx_143819  +
Is part of Wellcome Trust Sanger Institute; Hinxton; United Kingdom +, Harvard Medical School; Massachusetts; USA +
Keywords Human +, Genetic variation +, Gene +, Human gene +, Next-generation sequencing +, Sequence +, Alignment +, Genome +
Label Resource:1000 Genomes: A Deep Catalog of Human Genetic Variation  +
Modification date 26 March 2014 22:44:32  +
ModifiedDate 26 March 2014  +
Page has default form Resource  +
PublicationLink  +
RelatedTo Resource:OMICtools +
Species Human +
SuperCategory Resource  +
Synonym International 1000 Genomes Project  +, 1000 Genomes  +
Categories Resource
Properties that link here
Resource:1000 Genomes Project and AWS + Is part of
Resource:ART +, Resource:BioSample Database at EBI +, Resource:MOSAIK + RelatedTo

