
Data standardization (MEP/MIxS)

T-BAS saves placements and associated specimen metadata as Metadata Enhanced PhyloXML (MEP) files.

MEP is designed to support:

  • interoperable packaging of placements + metadata
  • downstream analysis in a phylogenetic context
  • consistent handling of metadata across T-BAS and other DeCIFR tools

MIxS compatibility

MEP files are structured to align with the MIxS (Minimum Information about any (x) Sequence) family of standards, maintained by the Genomic Standards Consortium, and use standardized headers where possible.

Tip

Use the provided metadata template when running a placement so your samples can be interpreted and re-used consistently.
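As a rough illustration of what "standardized headers" means in practice, the sketch below splits the header row of a metadata table into MIxS-style terms and everything else. The six terms listed are real MIxS environmental fields, but they are only a small illustrative subset; the column names `query_id` and `host_plant`, the sample table, and the function name are hypothetical and are not taken from the T-BAS template.

```python
import csv
import io

# Small illustrative subset of MIxS field names (not the full checklist).
MIXS_EXAMPLE_TERMS = {
    "collection_date",
    "geo_loc_name",
    "lat_lon",
    "env_broad_scale",
    "env_local_scale",
    "env_medium",
}


def classify_headers(csv_text, known_terms=MIXS_EXAMPLE_TERMS):
    """Split the header row into (standard, custom) column names."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = [h.strip() for h in next(reader)]
    standard = [h for h in headers if h in known_terms]
    custom = [h for h in headers if h not in known_terms]
    return standard, custom


# Hypothetical metadata table; the real template defines its own columns.
table = (
    "query_id,collection_date,lat_lon,host_plant\n"
    "S1,2020-05-01,35.78 N 78.64 W,Pinus taeda\n"
)
standard, custom = classify_headers(table)
print("standard:", standard)
print("custom:", custom)
```

A check like this can flag columns that may need renaming before a placement run, assuming the template follows MIxS naming for those fields.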

Relationship to XML and PhyloXML

  • XML is the underlying markup language.
  • PhyloXML is an XML-based format for representing phylogenetic trees and associated data.
  • MEP extends PhyloXML with additional tags needed for placements and metadata in T-BAS/DeCIFR.
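The layering described above can be sketched with Python's standard library. The PhyloXML namespace URI `http://www.phyloxml.org` is the one PhyloXML publishes, and the custom namespace URI is the one named under Schema definitions on this page. The prefix `phy`, the clade name, and the position of `cifr:attributes` inside a clade are assumptions for illustration only, not the actual MEP layout.

```python
import xml.etree.ElementTree as ET

PHYLO_NS = "http://www.phyloxml.org"  # standard PhyloXML namespace
CIFR_NS = "http://www.cifr.ncsu.edu"  # custom namespace named on this page

# Prefixes used when serializing; "phy" is an arbitrary choice.
ET.register_namespace("phy", PHYLO_NS)
ET.register_namespace("cifr", CIFR_NS)


def build_minimal_doc():
    """Build a tiny PhyloXML tree carrying one custom-namespace element."""
    root = ET.Element(f"{{{PHYLO_NS}}}phyloxml")
    phylogeny = ET.SubElement(root, f"{{{PHYLO_NS}}}phylogeny", rooted="true")
    clade = ET.SubElement(phylogeny, f"{{{PHYLO_NS}}}clade")
    name = ET.SubElement(clade, f"{{{PHYLO_NS}}}name")
    name.text = "specimen_1"
    # Hypothetical placement of a MEP extension element inside the clade.
    ET.SubElement(clade, f"{{{CIFR_NS}}}attributes")
    return root


xml_text = ET.tostring(build_minimal_doc(), encoding="unicode")
print(xml_text)
```

The output is ordinary XML: standard PhyloXML elements and the extension element are told apart only by their namespaces.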

Core MEP extensions

MEP adds or extends the following logical components:

OTUs

The cifr:otus / cifr:otu structures store:

  • OTU identifiers
  • taxonomic assignments
  • query sequences
  • per-sample metadata and placement summaries
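A minimal sketch of reading OTU records follows. Only the element names `cifr:otus` and `cifr:otu` and the namespace URI come from this page. The `id` attribute, the `cifr:taxonomy` and `cifr:sequence` children, and the sample values are hypothetical stand-ins, so consult the schema for the real structure.

```python
import xml.etree.ElementTree as ET

NS = {"cifr": "http://www.cifr.ncsu.edu"}

# Hypothetical fragment; the real child elements are defined by the MEP schema.
SAMPLE_OTUS = """
<cifr:otus xmlns:cifr="http://www.cifr.ncsu.edu">
  <cifr:otu id="OTU1">
    <cifr:taxonomy>Fungi;Ascomycota</cifr:taxonomy>
    <cifr:sequence>ACGTACGT</cifr:sequence>
  </cifr:otu>
  <cifr:otu id="OTU2">
    <cifr:taxonomy>Fungi;Basidiomycota</cifr:taxonomy>
    <cifr:sequence>TTGACA</cifr:sequence>
  </cifr:otu>
</cifr:otus>
"""


def read_otus(xml_text):
    """Return one dict per cifr:otu element found in the document."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for otu in root.iter(f"{{{NS['cifr']}}}otu"):
        records.append({
            "id": otu.get("id"),
            "taxonomy": otu.findtext("cifr:taxonomy", default="", namespaces=NS),
            "sequence": otu.findtext("cifr:sequence", default="", namespaces=NS),
        })
    return records


otus = read_otus(SAMPLE_OTUS)
for record in otus:
    print(record["id"], record["taxonomy"], len(record["sequence"]))
```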

[Figure: MEP OTUs schema view]

Attributes

The cifr:attributes structures store specimen metadata as name/value pairs in the tree context.
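To illustrate the name/value idea, the sketch below collects metadata pairs per clade into a dictionary. The `cifr:attribute` child element, its `name` attribute, the metadata names `host` and `country`, and the nesting inside a PhyloXML `clade` are all assumptions made for the example; only `cifr:attributes` and the namespace URI appear on this page.

```python
import xml.etree.ElementTree as ET

NS = {
    "phy": "http://www.phyloxml.org",
    "cifr": "http://www.cifr.ncsu.edu",
}

# Hypothetical document; the real attribute encoding is defined by the schema.
SAMPLE_TREE = """<phyloxml xmlns="http://www.phyloxml.org"
          xmlns:cifr="http://www.cifr.ncsu.edu">
  <phylogeny rooted="true">
    <clade>
      <name>specimen_1</name>
      <cifr:attributes>
        <cifr:attribute name="host">Pinus taeda</cifr:attribute>
        <cifr:attribute name="country">USA</cifr:attribute>
      </cifr:attributes>
    </clade>
  </phylogeny>
</phyloxml>"""


def attributes_by_clade(xml_text):
    """Map each named clade to a dict of its own name/value metadata."""
    root = ET.fromstring(xml_text)
    result = {}
    for clade in root.iter(f"{{{NS['phy']}}}clade"):
        name = clade.findtext("phy:name", namespaces=NS)
        # Direct children only, so nested clades keep their own metadata.
        pairs = {
            attr.get("name"): (attr.text or "").strip()
            for attr in clade.findall("cifr:attributes/cifr:attribute", NS)
        }
        if name is not None and pairs:
            result[name] = pairs
    return result


metadata = attributes_by_clade(SAMPLE_TREE)
print(metadata)
```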

[Figure: MEP attributes schema view]

Genes

The cifr:genes / cifr:gene structures store alignment metadata, including:

  • locus name
  • number of characters (nchar)
  • excluded character sets (exset) for alignment masking

Note

This supports multi-locus datasets and reproducible interpretation of alignments.
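The role of `nchar` and `exset` in alignment masking can be sketched as follows. The example assumes that an excluded character set is written as space-separated, 1-based, inclusive positions or ranges (for example `1-3 7`), in the style of NEXUS character sets. That encoding, the locus name, and the helper functions are assumptions for illustration; the actual MEP encoding may differ.

```python
def parse_exset(exset):
    """Expand a string such as '1-3 7' into a set of 1-based positions."""
    excluded = set()
    for token in exset.split():
        if "-" in token:
            start, end = token.split("-", 1)
            excluded.update(range(int(start), int(end) + 1))
        else:
            excluded.add(int(token))
    return excluded


def mask_sequence(aligned_seq, excluded):
    """Drop the excluded alignment columns from one aligned sequence."""
    return "".join(
        base
        for position, base in enumerate(aligned_seq, start=1)
        if position not in excluded
    )


# Hypothetical gene record and aligned sequence.
gene = {"name": "ITS", "nchar": 10, "exset": "1-3 7"}
aligned = "ACGTACGTAC"

# nchar lets a reader check that the alignment length matches the record.
if len(aligned) != gene["nchar"]:
    raise ValueError("alignment length does not match nchar")

kept = mask_sequence(aligned, parse_exset(gene["exset"]))
print(kept)
```

Storing the excluded set alongside the locus, rather than a pre-masked alignment, lets each reader apply or ignore the mask while working from the same underlying data.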

Schema definitions

MEP uses two associated schema definitions:

  • one that demonstrates how custom tags are added to PhyloXML
  • one that defines custom tags in the http://www.cifr.ncsu.edu namespace
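Because the custom tags live in their own namespace, they can be separated from standard PhyloXML elements without knowing the schema in detail. The sketch below counts elements that belong to the `http://www.cifr.ncsu.edu` namespace. The sample document and the position of `cifr:genes` in it are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

CIFR_NS = "http://www.cifr.ncsu.edu"

# Hypothetical document mixing PhyloXML and custom-namespace elements.
SAMPLE_DOC = """<phyloxml xmlns="http://www.phyloxml.org"
          xmlns:cifr="http://www.cifr.ncsu.edu">
  <phylogeny rooted="true"><clade><name>a</name></clade></phylogeny>
  <cifr:genes><cifr:gene/><cifr:gene/></cifr:genes>
</phyloxml>"""


def split_tag(tag):
    """Split an ElementTree tag '{uri}local' into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def count_custom_tags(xml_text, namespace=CIFR_NS):
    """Count elements per local name within the given namespace."""
    counts = Counter()
    for element in ET.fromstring(xml_text).iter():
        if not isinstance(element.tag, str):
            continue  # skip comments or processing instructions, if present
        uri, local = split_tag(element.tag)
        if uri == namespace:
            counts[local] += 1
    return counts


custom_counts = count_custom_tags(SAMPLE_DOC)
print(dict(custom_counts))
```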

(Both schema definitions are linked from the legacy manual. Stable public URLs will be added here once the long-term locations are decided.)