consensus-sequence-block

A consensus sequence contains a sequence of IUPAC nucleotides and novel 
        variants.

        The nucleotides are specified in the DNA alphabet.
        
        The DNA alphabet consists of primary nucleotides (A, C, G, T).

        Wildcard IUPAC nucleotides (M, R, W, S, Y, K, V, H, D, B, X, N) may be 
        used if they are acceptable in the context in which they appear. The 
        default is to use all upper case letters. 
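
The alphabet rule above can be sketched as a small check. This is only an illustration, not part of the schema: the function name `disallowed_characters` and the `allow_wildcards` flag are hypothetical, and folding lower case to upper case and ignoring whitespace are assumptions made for this example.

```python
# Primary nucleotides and wildcard codes exactly as listed in the documentation.
PRIMARY = set("ACGT")
WILDCARDS = set("MRWSYKVHDBXN")


def disallowed_characters(sequence, allow_wildcards=True):
    """Return a sorted list of characters that fall outside the allowed alphabet.

    Assumptions for this sketch: whitespace inside the sequence text is
    ignored, and lower-case letters are folded to upper case.
    """
    allowed = PRIMARY | WILDCARDS if allow_wildcards else PRIMARY
    cleaned = "".join(sequence.split()).upper()
    return sorted(set(cleaned) - allowed)


if __name__ == "__main__":
    print(disallowed_characters("ACGTRYN"))                      # wildcards accepted
    print(disallowed_characters("ACGTN", allow_wildcards=False))  # N flagged
```

Whether wildcards are acceptable depends on the context in which the sequence appears, which is why the sketch leaves that decision to the caller.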

        The full specification of the IUPAC codes may be found here:
        (http://nar.oxfordjournals.org/content/13/9/3021.short)
        Cornish-Bowden A. Nomenclature for incompletely specified bases in 
        nucleic acid sequences: recommendations 1984. Nucleic Acids Res. 1985; 
        13:3021-3030.

        Children:
        --------
        - sequence: (required, qty: 1) Nucleotide data for the consensus block.
        - variant:  (optional, 0 or more) If region-match is false, 
                    variant is expected to refer to the novel-variants.
        - sequence-quality: (optional, qty: 0 or more) A score for a sub-sequence 
                    specified by start and end (includes 'start', excludes 
                    'end') that indicates the quality of the read.
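
A closed-open range includes 'start' and excludes 'end', so it covers end - start positions. The sketch below illustrates that arithmetic. The helper names are hypothetical, and it assumes that the range and the block's own start use the same 0-based coordinate system, which the documentation does not state explicitly.

```python
def covered_positions(start, end):
    """Positions covered by the closed-open range [start, end)."""
    if start < 0 or end < start:
        raise ValueError(f"invalid closed-open range: [{start}, {end})")
    return range(start, end)


def subsequence(sequence, block_start, start, end):
    """Slice out the bases that a closed-open range refers to.

    Assumption: start and end are expressed in the same coordinate
    system as block_start, the start position of the enclosing block.
    """
    offset = start - block_start
    if offset < 0:
        raise ValueError("range starts before the block")
    length = len(covered_positions(start, end))
    return sequence[offset:offset + length]


if __name__ == "__main__":
    print(list(covered_positions(3, 6)))
    print(subsequence("ACGTACGT", 100, 102, 105))
```

Python slices are themselves closed-open, so no off-by-one adjustment is needed when mapping a range onto the sequence string.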

        Attributes:
        ----------  
        - reference-sequence-id: (required) Reference to a unique reference-sequence 
                     defined in this document under "consensus-sequence".  IDREF
                     must exactly match the ID for the reference-sequence.
        - start:    (documented as required; the schema declares use="optional")
                    Start position of a targeted region on the contig, in a
                    0-based or space-counted coordinate system, closed-open range.
        - end:      (documented as required; the schema declares use="optional")
                    End position of a targeted region on the contig, in a
                    0-based or space-counted coordinate system, closed-open range.
        - strand:   (optional) String value, one of "-1", "1", "-", "+";
                    defaults to "+" if unspecified.
        - phasing-group: (optional) Phasing group identifier. DEPRECATED; use "phase-set".
        - phase-set: (optional) Phase set identifier (string).
        - continuity: (optional) True if this represents a continuous read, false 
                    if not continuous.
        - expected-copy-number:  (optional) Integer for how many copies of 
                    the sequence block were expected (0 to n).
        - description:  (optional) Text description of the targeted region, like "HLA-A exon 3"
        - xs:anyAttribute:  Custom use attribute for additional sequence 
                    information. (optional)
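
A consumer reading the strand attribute has to handle four spellings plus the unspecified case. The sketch below shows one way to normalize them. The function name is hypothetical, and treating "1" as equivalent to "+" and "-1" as equivalent to "-" is an assumption; the documentation lists the four values without stating that equivalence.

```python
# Assumed equivalences: "1" means "+", "-1" means "-".
_STRAND_ALIASES = {"1": "+", "+": "+", "-1": "-", "-": "-"}


def normalize_strand(value=None):
    """Map a strand attribute value to "+" or "-".

    None stands for an absent attribute and yields the documented
    default of "+".
    """
    if value is None:
        return "+"
    try:
        return _STRAND_ALIASES[value]
    except KeyError:
        raise ValueError(f"unsupported strand value: {value!r}") from None


if __name__ == "__main__":
    for raw in (None, "1", "-1", "+", "-"):
        print(repr(raw), "->", normalize_strand(raw))
```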

Element Information

Model

Attributes

QName                   Type                        Fixed  Default  Use       Inheritable  Annotation
continuity              xs:boolean                                  optional
description             restriction of xs:string                    optional
end                     position-type                               optional
expected-copy-number    restriction of xs:int                       optional
phase-set               xs:string                                   optional
phasing-group           xs:string                                   optional
reference-sequence-id   xs:IDREF                                    required
start                   position-type                               optional
strand                  restriction of xs:string                    optional

Wildcard: ANY attribute from ANY namespace

Used By

Complex Type consensus-sequence

Source

<xs:element name="consensus-sequence-block">
  <xs:annotation>
    <xs:documentation>A consensus sequence contains a sequence of IUPAC nucleotides and novel variants. The nucleotides are specified in the DNA alphabet. The DNA alphabet consists of primary nucleotides (A, C, G, T). Wildcard IUPAC nucleotides (M, R, W, S, Y, K, V, H, D, B, X, N) may be used if they are acceptable in the context in which they appear. The default is to use all upper case letters. The full specification of the IUPAC codes may be found here: (http://nar.oxfordjournals.org/content/13/9/3021.short) Cornish-Bowden A. Nomenclature for incompletely specified bases in nucleic acid sequences: recommendations 1984. Nucleic Acids Res. 1985; 13:3021-3030. Children: -------- - sequence: (required, qty: 1) Nucleotide data for the consensus block. - variant: (optional, 0 or more) If region-match is false, variant is expected to refer to the novel-variants. - sequence-quality: (optional, qty: 0 or more) A score for a sub-sequence specified by start and end (includes 'start', excludes 'end') that indicates the quality of the read. Attributes: ---------- - reference-sequence-id: (required) Reference to a unique reference-sequence defined in this document under "consensus-sequence". IDREF must exactly match the ID for the reference-sequence. - start: (required) Start position of a targeted region on contig, 0-based or space-counted coordinate system, closed-open range - end: (required) End position of a targeted region on contig, 0-based or space-counted coordinate system, closed-open range - strand: (optional) String value (eg. one of "-1", "1", "-", "+"); defaults to "+" if unspecified - phasing-group: Phasing group identifier - DEPRECATED. Use "phase-set". - phase-set: Phase set identifier (string, optional) - continuity: (optional) True if this represents a continuous read, false if not continuous. - expected-copy-number: (optional) Integer for how many copies of the sequence block were expected (0 to n). 
- description: (optional) Text description of the targeted region, like "HLA-A exon 3" - xs:anyAttribute: Custom use attribute for additional sequence information. (optional)</xs:documentation>
  </xs:annotation>
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="hmlns:sequence" minOccurs="1" maxOccurs="1"/>
      <xs:element ref="hmlns:variant" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="hmlns:sequence-quality" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="reference-sequence-id" type="xs:IDREF" use="required"/>
    <xs:attribute name="start" type="hmlns:position-type" use="optional"/>
    <xs:attribute name="end" type="hmlns:position-type" use="optional"/>
    <xs:attribute name="strand" use="optional">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="-1"/>
          <xs:enumeration value="1"/>
          <xs:enumeration value="+"/>
          <xs:enumeration value="-"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="phasing-group" type="xs:string" use="optional"/>
    <!-- DEPRECATED: Use phase-set -->
    <xs:attribute name="phase-set" type="xs:string" use="optional"/>
    <xs:attribute name="continuity" type="xs:boolean" use="optional"/>
    <xs:attribute name="expected-copy-number" use="optional">
      <xs:simpleType>
        <xs:restriction base="xs:int">
          <xs:minInclusive value="0"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="description" use="optional">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:minLength value="1"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <!-- Custom use attribute for additional information (optional) -->
    <xs:anyAttribute/>
  </xs:complexType>
</xs:element>
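
The constraints declared in the schema source above can be spot-checked in application code. The following is a minimal sketch using Python's standard xml.etree.ElementTree. It is not a substitute for validating against the XSD, and the function name `check_block` and the inline example document are hypothetical. The namespace URI is the one used in the sample for this element.

```python
import xml.etree.ElementTree as ET

HML_NS = "http://schemas.nmdp.org/spec/hml/1.0"  # namespace as used in the sample
ALLOWED_STRANDS = {"-1", "1", "+", "-"}          # enumeration from the schema


def check_block(block):
    """Return a list of problems found in one consensus-sequence-block element."""
    problems = []
    if not block.get("reference-sequence-id"):
        problems.append("missing reference-sequence-id")
    strand = block.get("strand")
    if strand is not None and strand not in ALLOWED_STRANDS:
        problems.append(f"invalid strand: {strand!r}")
    copies = block.get("expected-copy-number")
    if copies is not None:
        try:
            if int(copies) < 0:
                problems.append("expected-copy-number is negative")
        except ValueError:
            problems.append("expected-copy-number is not an integer")
    if block.get("description") == "":
        problems.append("description is empty")
    # The schema requires exactly one sequence child (minOccurs=1, maxOccurs=1).
    n_seq = len(block.findall(f"{{{HML_NS}}}sequence"))
    if n_seq != 1:
        problems.append(f"expected exactly 1 sequence child, found {n_seq}")
    return problems


# Hypothetical minimal block, used only to exercise the checks.
EXAMPLE = (
    f'<consensus-sequence-block xmlns="{HML_NS}" reference-sequence-id="ref1" '
    'strand="1" expected-copy-number="1"><sequence>ACGT</sequence>'
    "</consensus-sequence-block>"
)

if __name__ == "__main__":
    print(check_block(ET.fromstring(EXAMPLE)))
```

The sketch does not verify that the IDREF resolves to an existing reference-sequence, nor the position-type of start and end, since that type is defined elsewhere in the schema.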

Sample

<consensus-sequence xmlns="http://schemas.nmdp.org/spec/hml/1.0" date="2014-11-02">
  <reference-database name="imgt-hla" description="IMGT/HLA Database" version="3.18.0" availability="public" curated="true" uri="http://www.ebi.ac.uk/ipd/imgt/hla">
    <reference-sequence id="ref1" name="HLA-A*01:01:01:01" start="0" end="3053" accession="HLA00001" uri="http://www.ebi.ac.uk/Tools/dbfetch/dbfetch?db=imgthla;id=HLA00001"/>
  </reference-database>
  <consensus-sequence-block reference-sequence-id="ref1" start="29942124" end="29944020" strand="1" phase-set="1" continuity="true" expected-copy-number="1" description="HLA-A exon 1">
    <sequence>
      TTTCTTGGAGCAGGTTAAACATGAGTGTCATTTCTTCAACGGGACGGAGCGGGTGCGGTTCCTGGACA GATACTTCTATCACCAAGAGGAGTACGTGCGCTTCGACAGCGACGTGGGGGAGTACCGGGCGGTGACG GAGCTGGGGCGGCCTAGCGCCGAGTACTGGAACAGCCAGAAGGACCTCCTGGAGCAGAGGCGGGCCGA GGTGGACACCTACTGCAGACACAACTACGGGGTTGTGGAGAGCTTCACA
    </sequence>
    <variant reference-bases="T" alternate-bases="C" start="29942937" end="29943001">
      <variant-effect term="missense_variant"/>
    </variant>
    <variant reference-bases="CG" alternate-bases="C" start="29942999" end="29943025">
      <variant-effect term="frameshift_variant"/>
    </variant>
    <variant reference-bases="A" alternate-bases="AT" start="29942760" end="29942852">
      <variant-effect term="stop_gained"/>
    </variant>
    <sequence-quality sequence-start="29942937" sequence-end="29943001" quality-score="1.0"/>
    <sequence-quality sequence-start="29942955" sequence-end="29943020" quality-score="1.0"/>
  </consensus-sequence-block>
  <consensus-sequence-block reference-sequence-id="ref1" start="32584109" end="32584377" strand="1" phase-set="2" continuity="true" expected-copy-number="1" description="HLA-B exon 1">