From NeuroLex

Resource:Sequence Tag Alignment and Consensus Knowledgebase Database

Name: Resource:Sequence Tag Alignment and Consensus Knowledgebase Database
Description:

STACKdb is a knowledgebase generated by processing EST and mRNA sequences obtained from GenBank through a pipeline of masking, clustering, alignment and variation analysis steps. The STACK project aims to generate a comprehensive representation of the sequence of each expressed gene in the human genome by extensively processing gene fragments to make accurate alignments, highlight diversity and provide a carefully joined set of consensus sequences for each gene.
The STACK project comprises the STACKdb human gene index, a database of virtual human transcripts, and stackPACK, the tools used to create the database. STACKdb is organized into 15 tissue-based categories and one disease category.
STACK is a tool for detection and visualization of expressed transcript variation in the context of developmental and pathological states. The data system organizes and reconstructs human transcripts from available public data in the context of expression state. The expression state of a transcript can include developmental state, pathological association, site of expression and isoform of expressed transcript. STACK consensus transcripts are reconstructed from clusters that capture and reflect the growing evidence of transcript diversity.
The comprehensive capture of transcript variants is achieved by a novel clustering approach that is tolerant of sub-sequence diversity and does not rely on pairwise alignment. This contrasts with other gene indexing projects. STACK is generated at least four times a year and represents the exhaustive processing of all publicly available human EST data extracted from GenBank. The processed information can be explored through 15 tissue-specific categories, a disease-related category and a whole-body index.
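As an illustration only of what clustering without pairwise alignment can look like, the sketch below groups sequences by shared short words (k-mers). This is a generic word-content heuristic, not the STACK clustering method; the function names, the word length `k`, the `min_shared` threshold and the example sequences are all assumptions made for the example.

```python
from itertools import combinations


def kmers(seq, k=6):
    """Return the set of all length-k words in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_by_shared_words(seqs, k=6, min_shared=3):
    """Single-linkage clustering on shared k-mers (hypothetical helper).

    Two sequences are joined when they share at least min_shared words.
    Word sets are compared directly; no alignment is computed.
    """
    ids = list(seqs)
    parent = {i: i for i in ids}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    words = {i: kmers(seqs[i], k) for i in ids}
    for a, b in combinations(ids, 2):
        if len(words[a] & words[b]) >= min_shared:
            parent[find(a)] = find(b)

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), set()).add(i)
    return sorted(clusters.values(), key=sorted)


# Made-up sequences: est1 and est2 overlap, est3 is unrelated.
example = {
    "est1": "ACGTACGTTAGCCGATAGGCTA",
    "est2": "TTAGCCGATAGGCTAAACCGGT",
    "est3": "GGGGGGCCCCCCAAAAAATTTT",
}
clusters = cluster_by_shared_words(example)
```

Here `clusters` groups est1 with est2 and leaves est3 on its own. Because membership depends on shared words rather than on an end-to-end alignment, sequences that differ in part of their length can still fall into the same cluster.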
The stackPACK transcript reconstruction and variation analysis system allows rapid and accurate processing of EST and mRNA data through a pipeline of steps: masking, loose clustering, assembly and alignment, alignment analysis for variation in transcripts, and linking of non-overlapping clusters by clone ID. The system is unique in its visualization tools and its efficient data management, which uses a relational database. stackPACK can be accessed either from the command line or through a web-based interface.
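The last pipeline step, linking non-overlapping clusters by clone ID, can be pictured with a small sketch. It assumes each EST carries an optional clone ID annotation and that two clusters are linked whenever they contain ESTs from the same clone; the function name, data layout and identifiers are hypothetical and are not taken from stackPACK.

```python
from collections import defaultdict


def link_clusters_by_clone(cluster_of_est, clone_of_est):
    """Group cluster IDs that share at least one clone ID.

    cluster_of_est: EST accession -> cluster ID
    clone_of_est:   EST accession -> clone ID (ESTs without one are omitted)
    Returns a list of sets of cluster IDs, one set per linked group.
    """
    parent = {c: c for c in cluster_of_est.values()}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Collect, for every clone, the clusters its ESTs landed in.
    clusters_by_clone = defaultdict(set)
    for est, clone in clone_of_est.items():
        if est in cluster_of_est:
            clusters_by_clone[clone].add(cluster_of_est[est])

    # Merge all clusters that are reachable through a common clone.
    for linked in clusters_by_clone.values():
        first, *rest = sorted(linked)
        for other in rest:
            parent[find(other)] = find(first)

    groups = defaultdict(set)
    for c in parent:
        groups[find(c)].add(c)
    return sorted(groups.values(), key=sorted)


# Made-up identifiers: A1 and A2 come from the same clone but sit in
# different clusters, so c1 and c2 are linked; c3 stays separate.
linked_groups = link_clusters_by_clone(
    {"A1": "c1", "A2": "c2", "A3": "c3"},
    {"A1": "clone7", "A2": "clone7"},
)
```

In this toy input the two reads from `clone7` tie clusters `c1` and `c2` together even though their sequences need not overlap.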
The STACK_PACK clustering system has been applied to dbEST release 121598. Of 1,313,103 Homo sapiens ESTs, 64% are condensed into 143,885 tissue-level multiple sequence clusters; linking through clone-ID annotations produces 68,701 total assemblies, such that 81% of the original input set is captured in a STACK multiple sequence or linked cluster.
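To put these figures in absolute terms, the following back-of-envelope calculation converts the percentages into EST counts. The percentages are rounded in the source, so the derived counts are approximate, and the mean cluster size is an inference from the quoted numbers rather than a reported statistic.

```python
total_ests = 1_313_103        # Homo sapiens ESTs in the input set
clustered_fraction = 0.64     # share in tissue-level multiple sequence clusters
captured_fraction = 0.81      # share in a multiple sequence or linked cluster
multi_seq_clusters = 143_885  # tissue-level multiple sequence clusters

# Approximate absolute counts implied by the rounded percentages.
clustered_ests = round(total_ests * clustered_fraction)
captured_ests = round(total_ests * captured_fraction)

# Approximate mean number of ESTs per multiple sequence cluster.
mean_cluster_size = clustered_ests / multi_seq_clusters
```

Under these assumptions roughly 840,000 ESTs fall into multiple sequence clusters, which works out to a little under six ESTs per cluster on average.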
Indexing of alignments by substituent EST accession allows browsing of the data structure and its cross-links to UniGene. STACK meta-clusters consolidate a greater number of ESTs, by a factor of 1.86, than the corresponding UniGene build. Fidelity comparison with genome reference sequence AC004106 demonstrates consensus expression clusters that reflect significantly lower spurious repeat sequence content and capture alternate splicing within a whole-body index cluster and three STACK v2.3 tissue-level clusters.

Sponsors: This work was originally funded under U.S. Department of Energy grant DE-FC03-95ER62062 (W.A.H.) and S.A. Foundation for Research grant GUN 2039524 (W.A.H.)

Other Name(s): STACKdb
Resource Type(s): Database, Software resource, Data visualization software
Keywords: exonic, expressed, expressed sequence tag (est), expression, fragment, gene, alignment, alternative gene, cdna, clone, cluster, developmental, disease, diversity, genome, homo sapiens, human, isoform, knowledgebase, meta-cluster, mrna, pathological, sequence, tissue, transcript, variant, visualization
Resource: Resource
URL: http://ww2.sanbi.ac.za/Dbases.html
Id: nif-0000-20946

Curation status: Uncurated

Contributors

Aarnaud, Ccdbuser, Nifbot2



Facts about Resource:Sequence Tag Alignment and Consensus Knowledgebase Database
CurationStatus: curated
DefiningCitation: http://ww2.sanbi.ac.za/Dbases.html
Has default form: Resource
Has role: Database, Software resource, Data visualization software
Id: nif-0000-20946
Keywords: Exonic, Expressed, Expressed sequence tag (est), Expression, Fragment, Gene, Alignment, Alternative gene, Cdna, Clone, Cluster, Developmental, Disease, Diversity, Genome, Homo sapiens, Human, Isoform, Knowledgebase, Meta-cluster, Mrna, Pathological, Sequence, Tissue, Transcript, Variant, Visualization
Label: Resource:Sequence Tag Alignment and Consensus Knowledgebase Database
ModifiedDate: 22 June 2013
Page has default form: Resource
SuperCategory: Resource
Synonym: STACKdb