
From NeuroLex




Resource:caTIES - Cancer Text Information Extraction System

Name: Resource:caTIES - Cancer Text Information Extraction System
Description: The Cancer Text Information Extraction System (caTIES) provides tools for de-identification and automated coding of free-text structured pathology reports. It also has a client that can be used to search these coded reports. The client also supports Tissue Banking and Honest Broker operations.

caTIES focuses on two important challenges of bioinformatics:

  • Information extraction (IE) from free text
  • Access to tissue.

Regarding the first challenge, information from free-text pathology documents represents a vital and often underutilized source of data for cancer researchers. Typically, extracting useful data from these documents is a slow and laborious manual process requiring significant domain expertise. Automated IE methods can radically increase the speed and scope with which these data can be accessed. Regarding the second challenge, there is a pressing need in the cancer research community to gain access to tissue that meets specific experimental criteria. Presently, there are vast quantities of frozen tissue and paraffin-embedded tissue throughout the country, but because of a lack of annotation, or a lack of access to annotation, these tissues are often unavailable to individual researchers.

caTIES has three goals designed to solve these problems:

  • Extract coded information from free-text Surgical Pathology Reports (SPRs), using controlled terminologies to populate caBIG-compliant data structures.
  • Provide researchers with the ability to query, browse and create orders for annotated tissue data and physical material across a network of federated sources. With caTIES the SPR acts as a locator to tissue resources.
  • Pioneer research for distributed text information extraction within the context of caBIG.

caTIES focuses on IE from SPRs because they represent a high-dividend target for automated analysis. There are millions of SPRs in each major hospital system, and SPRs contain important information for researchers. SPRs act as tissue locators by indicating the presence of tissue blocks, frozen tissue and other resources, and by identifying the relationship of the tissue block to significant landmarks such as tumor margins. At present, nearly all important data within SPRs are embedded within loosely structured free text. For these reasons, SPRs were chosen for coding through caTIES: facilitating access to the information they contain will have a powerful impact on cancer research.

Once SPR information has been run through the caTIES Pipeline, the researcher may query and inspect the data. The goal of a search may be to extract and analyze data or to acquire slides of tissue for further study. caTIES provides two query interfaces: a simple query dashboard and an advanced diagram query builder. Both interfaces support NCI Metathesaurus concept-based searching as well as string searching. The diagram interface additionally supports more advanced search functionality.
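The difference between concept-based and string searching can be illustrated with a minimal Python sketch. This is not caTIES code: the concept identifiers (CX001, CX002), the synonym table, the CodedReport class and the function names are all hypothetical, and the toy substring matcher merely stands in for a real coding pipeline.

```python
from dataclasses import dataclass, field

# Hypothetical concept identifiers and synonyms (NOT real NCI Metathesaurus codes).
SYNONYMS = {
    "CX001": {"adenocarcinoma", "glandular carcinoma"},
    "CX002": {"prostate", "prostatic"},
}


@dataclass
class CodedReport:
    report_id: str
    text: str
    concepts: set = field(default_factory=set)


def code_report(report_id, text):
    """Toy coder: tag a report with every concept whose synonym occurs in the text."""
    lowered = text.lower()
    concepts = {
        cid for cid, terms in SYNONYMS.items()
        if any(term in lowered for term in terms)
    }
    return CodedReport(report_id, text, concepts)


def string_search(reports, needle):
    """Return ids of reports whose raw text contains the literal string."""
    needle = needle.lower()
    return [r.report_id for r in reports if needle in r.text.lower()]


def concept_search(reports, concept_id):
    """Return ids of reports coded with the concept, whatever wording was used."""
    return [r.report_id for r in reports if concept_id in r.concepts]


reports = [
    code_report("r1", "Prostatic adenocarcinoma, margins negative."),
    code_report("r2", "Glandular carcinoma of the prostate."),
    code_report("r3", "Benign skin lesion."),
]

print(string_search(reports, "adenocarcinoma"))  # literal match only
print(concept_search(reports, "CX001"))          # matches either synonym
```

In this toy data the string search finds only the report that uses the word "adenocarcinoma", while the concept search also finds the report that uses a synonym, which is the practical benefit of coding reports against a controlled terminology.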

An important aspect of the interface is the ability to manage queries and case sets. Users can vet query results and save them to case sets, which can be edited at a later time. Case sets can be submitted as tissue orders or used to derive data extracts. Queries can also be saved and modified at a later time.
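As a rough illustration of the case-set workflow described above (vet results, save them, edit later, submit as an order), here is a small Python sketch. The CaseSet class, its methods and the order dictionary layout are assumptions made for illustration and do not reflect the actual caTIES client or data model.

```python
from dataclasses import dataclass, field


@dataclass
class CaseSet:
    """Hypothetical container for vetted query results."""
    name: str
    case_ids: list = field(default_factory=list)

    def add_vetted(self, results, accept):
        """Keep only results the user accepts, skipping ids already in the set."""
        for case_id in results:
            if accept(case_id) and case_id not in self.case_ids:
                self.case_ids.append(case_id)

    def remove(self, case_id):
        """Edit the set later by dropping a case."""
        self.case_ids.remove(case_id)

    def as_tissue_order(self):
        """Turn the current contents into a simple order record (assumed layout)."""
        return {"order_for": self.name, "cases": list(self.case_ids)}


# Usage: vet three query results, edit the saved set, then build an order.
query_results = ["r1", "r2", "r3"]
case_set = CaseSet("prostate-study")
case_set.add_vetted(query_results, accept=lambda cid: cid != "r3")
case_set.remove("r2")
order = case_set.as_tissue_order()
print(order)
```

The vetting step is modeled as a callback so that the acceptance decision stays with the user rather than with the query itself.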

caTIES provides the following web services by default: MMTx Service, TIES Coder Service
Other Name(s): Cancer Text Information Extraction System, Cancer Text Information Extraction System (caTIES)
Abbreviation: caTIES
Parent Organization: University of Pittsburgh School of Medicine; Pennsylvania; USA
Supporting Agency: National Cancer Institute, Resource:Cancer Biomedical Informatics Grid
Grant: R01 CA132672, contract #79207CBS10, U54 RR023506-01, U01 CA 091343
Resource Type(s): Data processing software, Web service
Resource: Resource
URL: http://caties.cabig.upmc.edu/
Id: nif-0000-33212
PMID: 20442142
Related to: Resource:Cancer Biomedical Informatics Grid, Resource:Biositemaps
Availability: Open Source
Keywords: extraction, Cancer, code, de-identification, information, paraffin, pathology, research, structure, surgical, system, tissue, tool, text, natural language processing, Tissue Banking, Translational Research, Data Sharing, Collaboration, Natural Language Processing, Text-Processing, Text-Mining, Grid computing, Service Oriented Architecture, Query Visualization, Medical Record, Bioinformatics, Automated Coding

Curation status: Uncurated



Contributors

Aarnaud, Ccdbuser



Facts about Resource:caTIES - Cancer Text Information Extraction System
Abbrev: caTIES
Availability: Open Source
CurationStatus: curated
DefiningCitation: http://caties.cabig.upmc.edu/
Grant: R01 CA132672, contract #79207CBS10, U54 RR023506-01, U01 CA 091343
Has default form: Resource
Has role: Data processing software, Web service
Id: nif-0000-33212
Is part of: University of Pittsburgh School of Medicine; Pennsylvania; USA
Keywords: Extraction, Cancer, Code, De-identification, Information, Paraffin, Pathology, Research, Structure, Surgical, System, Tissue, Tool, Text, Natural language processing, Tissue Banking, Translational Research, Data Sharing, Collaboration, Natural Language Processing, Text-Processing, Text-Mining, Grid computing, Service Oriented Architecture, Query Visualization, Medical Record, Bioinformatics, Automated Coding
Label: Resource:caTIES - Cancer Text Information Extraction System
ModifiedDate: 23 August 2012
PMID: 20442142
Page has default form: Resource
RelatedTo: Resource:Cancer Biomedical Informatics Grid, Resource:Biositemaps
SuperCategory: Resource
Supporting Agency: National Cancer Institute, Resource:Cancer Biomedical Informatics Grid
Synonym: Cancer Text Information Extraction System, Cancer Text Information Extraction System (caTIES)