NIH Research Festival
SPARCLE (Subfamily Protein Architecture Labeling Engine) is a curation interface and corresponding database developed at the National Center for Biotechnology Information (NCBI) that allows Conserved Domain Database (CDD) curators to functionally characterize and label protein sequences that have been grouped by their subfamily domain architecture. Subfamily domain architectures result from annotating protein sequences with domain footprints provided by CDD, a resource that consists of a collection of well-annotated multiple sequence alignment models for ancient domains and full-length proteins. This resource includes NCBI-curated domains, which use 3D-structure information to explicitly define domain boundaries and provide insights into sequence/structure/function relationships, as well as protein and domain models imported from external source databases such as Pfam, SMART, and TIGRFAMs. The assigned protein names and labels reflect the underlying protein and domain model collections and therefore range from generic to very specific. In many cases, they provide concise functional annotations that are easier to interpret than raw representations of domain architecture. To date, SPARCLE has been used to name several million prokaryotic proteins in RefSeq and is cited in 22 million WP records.
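The grouping idea described above can be illustrated with a minimal Python sketch. It is not SPARCLE's implementation: it assumes that an architecture is simply the ordered tuple of domain models hit along a sequence, and every identifier in it (`protA`, `domX`, the `labels` table, the function names) is hypothetical.

```python
from collections import defaultdict

# Hypothetical domain hits: (protein_id, domain_model, start, end).
# All identifiers are made up for illustration; they are not real CDD models.
hits = [
    ("protA", "domX", 5, 120),
    ("protA", "domY", 130, 300),
    ("protB", "domY", 140, 310),
    ("protB", "domX", 3, 118),
    ("protC", "domX", 10, 125),
]


def architecture_of(protein_hits):
    """Return the domain models ordered by start position along the sequence."""
    ordered = sorted(protein_hits, key=lambda hit: hit[1])
    return tuple(model for model, _start, _end in ordered)


def group_by_architecture(all_hits):
    """Map each architecture (tuple of domain models) to the proteins that share it."""
    per_protein = defaultdict(list)
    for protein_id, model, start, end in all_hits:
        per_protein[protein_id].append((model, start, end))

    groups = defaultdict(list)
    for protein_id, protein_hits in per_protein.items():
        groups[architecture_of(protein_hits)].append(protein_id)
    return dict(groups)


# Hypothetical curator-assigned labels, keyed by architecture.
labels = {("domX", "domY"): "example two-domain protein"}

groups = group_by_architecture(hits)
for arch, proteins in groups.items():
    # Architectures without a curated label fall back to a placeholder.
    print(arch, sorted(proteins), labels.get(arch, "unlabeled"))
```

In this sketch, a single curated label attached to an architecture applies to every protein in that group, which is the property that lets one curation decision name many sequences at once.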
Scientific Focus Area: Computational Biology
This page was last updated on Friday, March 26, 2021