Public:BWmeta format

Revision as of 10:14, 6 October 2010

Service details
Name: BWmeta
Code location (relative to SVN root): projects/dir/bwmeta-core
Javadoc
Contact person: Jakub Jurkiewicz

BWmeta is a general-purpose metadata format capable of describing entities such as academic articles, books, audio recordings, laws, or molecular sequences. The format was originally designed to meet the need for a single, flexible, publisher-independent metadata format for documents hosted by the YADDA platform.

Basic tags

What follows is a brief summary of the most important tags in the BWmeta format. A sample document described using the format is shown in the next section.

  • element contains metadata related to a given document or object.
  • name covers titles, subtitles, names, surnames, name alternatives, and abbreviations, while description covers abstracts and comments. As a rule of thumb, name tends to hold one-line texts and description multi-line texts. Optional attributes: lang (ISO 639-1 or ISO 639-2), type (such as surname, forenames, title) and sort-key.
  • attribute is a general-purpose tag. It is a tree with key-value pairs in each node. It is used to store information which does not map well to existing tags.
  • contents contains a tree of directory and file tags that describe content files (e.g. type, size, checksums). Each file tag contains one or more location tags, each holding the URL of a copy of the described file.
  • contributor with a required attribute role (such as: author, editor, composer, translator) represents a person or institution that has contributed to the creation of the document/entity. Child tags include any number of name and description tags with different types, as well as references to affiliations (the affiliation tags are located directly under the element tag).
  • id consists of a scheme (e.g. DOI, PMID) and the actual value of the identifier.
  • structure tag contains references to all the parent elements (stored in the ancestor tag). For example, in the case of a journal article, these might be: the journal, the volume, and the issue containing the article.
  • relation tags contain references to other documents, for example bibliographic references.

Sample metadata in BWmeta 1.2.0 format

<?xml version="1.0"?>
<element id="bwmeta1.element.5d222ced-92d1-4266-a18f-6f0421389a21" version="42">
  <name lang="en">Collective dynamics of 'small-world' networks</name>
  <description lang="en">Networks of coupled dynamical systems have been used to model [...]</description>

  <structure hierarchy="bwmeta1.hierarchy-class.hierarchy_Journal" level="bwmeta1.level.hierarchy_Journal_Article" position="440–442">
    <ancestor level="bwmeta1.level.hierarchy_Journal_Journal" ref="bwmeta1.element.6d17104f-7ce8-4d2e-82c8-9c353345c8b6">
      <name>Nature</name>
    </ancestor>
    <ancestor level="bwmeta1.level.hierarchy_Journal_Volume" ref="bwmeta1.element.fd77fb98-0e86-4969-b5da-2570554ae774">
      <name>393</name>
    </ancestor>
    <ancestor level="bwmeta1.level.hierarchy_Journal_Issue" ref="bwmeta1.element.810510cd-f9cd-4ad7-8493-e5cabcb1c5a3">
      <name>6684</name>
    </ancestor>
  </structure>

  <contents>
    <file type="content" format="application/pdf" id="file.0f5bbc27-fc69-4a54-998d-e9fd47562768" size="279854">
      <location>yar://contents/0f5b/bc27/fc69/4a54/0f5bbc27-fc69-4a54-998d-e9fd47562768.pdf</location>
    </file>
    <file type="fulltext" format="text/plain" id="file.281badc6-82ae-4c60-8462-70ac55e52434" langs="en">
      <location>yar://fulltexts/281b/adc6/82ae/4c60/281badc6-82ae-4c60-8462-70ac55e52434.txt</location>
    </file>
  </contents>

  <affiliation id="aff.ef21a462-3a17-494f-b176-cccd20bb7c62">
    <text>Department of Theoretical and Applied Mechanics, Kimball Hall, Cornell University, Ithaca, New York 14853, USA</text>
  </affiliation>

  <contributor role="author">
    <person>
      <name type="canonical">Duncan J. Watts</name>
      <name type="surname">Watts</name>
      <name type="forenames">Duncan J.</name>
    </person>
    <affiliation ref="aff.ef21a462-3a17-494f-b176-cccd20bb7c62"/>
  </contributor>
  <contributor role="author">
    <person>
      <name type="canonical">Steven H. Strogatz</name>
      <name type="surname">Strogatz</name>
      <name type="forenames">Steven H.</name>
    </person>
    <affiliation ref="aff.ef21a462-3a17-494f-b176-cccd20bb7c62"/>
  </contributor>

  <id scheme="bwmeta1.id-class.DOI" value="10.1038/30918"/>
  <id scheme="bwmeta1.id-class.PMID" value="9623998"/>

  <relation type="reference">
    <attribute key="text" value="1. Winfree, A. T. The Geometry of Biological Time (Springer, New York, 1980)."/>
    <attribute key="parsed">
      <attribute key="type" value="book"/>
      <attribute key="position" value="1"/>
      <attribute key="authors">
        <attribute key="author" value="A. T. Winfree">
          <attribute key="forenames" value="A. T."/>
          <attribute key="surname" value="Winfree"/>
        </attribute>
      </attribute>
      <attribute key="title" value="The Geometry of Biological Time"/>
      <attribute key="publisher" value="Springer"/>
      <attribute key="city" value="New York"/>
      <attribute key="year" value="1980"/>
    </attribute>
  </relation>
  [...]

</element>
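
A document like the sample above can be read with any XML parser. The following is a minimal sketch in Python using the standard-library xml.etree.ElementTree module; it is not part of the BWmeta tooling. It assumes that the tags are not namespace-qualified, as in the sample, and it works on an abbreviated copy of the sample, because the elided parts ([...]) are not valid XML.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample; elided parts are left out entirely.
SAMPLE = """
<element id="bwmeta1.element.5d222ced-92d1-4266-a18f-6f0421389a21" version="42">
  <name lang="en">Collective dynamics of 'small-world' networks</name>
  <structure hierarchy="bwmeta1.hierarchy-class.hierarchy_Journal" level="bwmeta1.level.hierarchy_Journal_Article">
    <ancestor level="bwmeta1.level.hierarchy_Journal_Journal" ref="bwmeta1.element.6d17104f-7ce8-4d2e-82c8-9c353345c8b6">
      <name>Nature</name>
    </ancestor>
    <ancestor level="bwmeta1.level.hierarchy_Journal_Volume" ref="bwmeta1.element.fd77fb98-0e86-4969-b5da-2570554ae774">
      <name>393</name>
    </ancestor>
  </structure>
  <contributor role="author">
    <person>
      <name type="canonical">Duncan J. Watts</name>
    </person>
  </contributor>
  <id scheme="bwmeta1.id-class.DOI" value="10.1038/30918"/>
</element>
"""

root = ET.fromstring(SAMPLE)

# The title is the <name> directly under <element>.
title = root.findtext("name")

# Map each id scheme to its value.
ids = {i.get("scheme"): i.get("value") for i in root.findall("id")}

# Names of the parent elements, in document order.
ancestors = [a.findtext("name") for a in root.findall("structure/ancestor")]

# Canonical names of all contributors with role="author".
authors = [
    c.findtext("person/name[@type='canonical']")
    for c in root.findall("contributor[@role='author']")
]

print(title, ids, ancestors, authors, sep="\n")
```

If a real document declares a namespace for the BWmeta tags, the tag names in the paths above would need to be qualified accordingly.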

Rich text support

Starting with BWmeta 2.0.0, certain fields support rich text. By rich text we mean mixed content of text and tags from namespaces other than bwmeta. For example, the following should be a valid fragment of BWmeta 2.0.0 (assuming that suitable declarations of the xhtml and mathml namespaces are present):

 <description lang="eng" type="abstract">
     <xhtml:em>Lorem ipsum</xhtml:em> dolor sit amet <mathml:math>[...]</mathml:math>
 </description>

The following fields support rich text:

  • name
  • description
  • tag
  • attribute
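
To illustrate what mixed content means for a consumer of the format, here is a small Python sketch (standard library only) that parses a rich-text description and recovers its plain text. The namespace declarations use the standard XHTML and MathML namespace URIs; placing them directly on the description tag and the mathml:mi content standing in for the elided formula are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# The xmlns declarations are placed on <description> for self-containment;
# in a full document they would more likely sit on the root element.
# <mathml:mi>x</mathml:mi> is a hypothetical stand-in for the elided formula.
fragment = (
    '<description lang="eng" type="abstract" '
    'xmlns:xhtml="http://www.w3.org/1999/xhtml" '
    'xmlns:mathml="http://www.w3.org/1998/Math/MathML">'
    '<xhtml:em>Lorem ipsum</xhtml:em> dolor sit amet '
    '<mathml:math><mathml:mi>x</mathml:mi></mathml:math>'
    '</description>'
)

node = ET.fromstring(fragment)

# itertext() walks text and tails in document order, dropping the markup.
plain_text = "".join(node.itertext())

# ElementTree stores qualified tags as "{namespace-uri}local-name".
foreign_namespaces = sorted(
    {child.tag.split("}")[0].strip("{") for child in node}
)

print(plain_text)
print(foreign_namespaces)
```

A consumer that only needs searchable text can use the plain-text form, while a renderer would dispatch on the namespace of each child tag.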

The change to attribute makes BWmeta 2.0.0 incompatible with BWmeta 1.2.0. The following BWmeta 1.2.0 markup:

 <attribute key="foo" value="bar"/>

becomes the following in BWmeta 2.0.0:

 <attribute key="foo">
     <value>bar</value>
 </attribute>
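
The rewrite shown above is mechanical, so existing documents could be migrated with a short script. The following Python sketch is one possible approach, not an official migration tool: the function name upgrade_attributes is hypothetical, and the code assumes that attribute tags are not namespace-qualified and that nothing else changes between the two versions.

```python
import xml.etree.ElementTree as ET


def upgrade_attributes(root):
    """Move each attribute/@value into a leading <value> child, in place."""
    # list(...) snapshots the matches so the tree can be modified safely;
    # iter() also visits root itself, so a top-level <attribute> is handled.
    for node in list(root.iter("attribute")):
        text = node.attrib.pop("value", None)
        if text is None:
            continue  # this node only groups nested attributes
        value = ET.Element("value")
        value.text = text
        node.insert(0, value)  # keep <value> ahead of nested <attribute> tags
    return root


old = ET.fromstring(
    '<relation type="reference">'
    '<attribute key="author" value="A. T. Winfree">'
    '<attribute key="surname" value="Winfree"/>'
    '</attribute>'
    '</relation>'
)
new_xml = ET.tostring(upgrade_attributes(old), encoding="unicode")
print(new_xml)
```

Nested attributes are converted as well, because every attribute tag in the tree is visited regardless of depth.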

Sample metadata in BWmeta 2.0.0 format

<?xml version="1.0"?>
<element id="bwmeta1.element.5d222ced-92d1-4266-a18f-6f0421389a21" version="42">
  <name lang="en">Collective dynamics of 'small-world' networks</name>
  <description lang="en">Networks of coupled dynamical systems have been used to model [...]</description>

  <structure hierarchy="bwmeta1.hierarchy-class.hierarchy_Journal" level="bwmeta1.level.hierarchy_Journal_Article" position="440–442">
    <ancestor level="bwmeta1.level.hierarchy_Journal_Journal" ref="bwmeta1.element.6d17104f-7ce8-4d2e-82c8-9c353345c8b6">
      <name>Nature</name>
    </ancestor>
    <ancestor level="bwmeta1.level.hierarchy_Journal_Volume" ref="bwmeta1.element.fd77fb98-0e86-4969-b5da-2570554ae774">
      <name>393</name>
    </ancestor>
    <ancestor level="bwmeta1.level.hierarchy_Journal_Issue" ref="bwmeta1.element.810510cd-f9cd-4ad7-8493-e5cabcb1c5a3">
      <name>6684</name>
    </ancestor>
  </structure>

  <contents>
    <file type="content" format="application/pdf" id="file.0f5bbc27-fc69-4a54-998d-e9fd47562768" size="279854">
      <location>yar://contents/0f5b/bc27/fc69/4a54/0f5bbc27-fc69-4a54-998d-e9fd47562768.pdf</location>
    </file>
    <file type="fulltext" format="text/plain" id="file.281badc6-82ae-4c60-8462-70ac55e52434" langs="en">
      <location>yar://fulltexts/281b/adc6/82ae/4c60/281badc6-82ae-4c60-8462-70ac55e52434.txt</location>
    </file>
  </contents>

  <affiliation id="aff.ef21a462-3a17-494f-b176-cccd20bb7c62">
    <text>Department of Theoretical and Applied Mechanics, Kimball Hall, Cornell University, Ithaca, New York 14853, USA</text>
  </affiliation>

  <contributor role="author">
    <person>
      <name type="canonical">Duncan J. Watts</name>
      <name type="surname">Watts</name>
      <name type="forenames">Duncan J.</name>
    </person>
    <affiliation ref="aff.ef21a462-3a17-494f-b176-cccd20bb7c62"/>
  </contributor>
  <contributor role="author">
    <person>
      <name type="canonical">Steven H. Strogatz</name>
      <name type="surname">Strogatz</name>
      <name type="forenames">Steven H.</name>
    </person>
    <affiliation ref="aff.ef21a462-3a17-494f-b176-cccd20bb7c62"/>
  </contributor>

  <id scheme="bwmeta1.id-class.DOI" value="10.1038/30918"/>
  <id scheme="bwmeta1.id-class.PMID" value="9623998"/>

  <relation type="reference">
    <attribute key="text"><value>1. Winfree, A. T. The Geometry of Biological Time (Springer, New York, 1980).</value></attribute>
    <attribute key="parsed">
      <attribute key="type"><value>book</value></attribute>
      <attribute key="position"><value>1</value></attribute>
      <attribute key="authors">
        <attribute key="author">
          <value>A. T. Winfree</value>
          <attribute key="forenames"><value>A. T.</value></attribute>
          <attribute key="surname"><value>Winfree</value></attribute>
        </attribute>
      </attribute>
      <attribute key="title"><value>The Geometry of Biological Time</value></attribute>
      <attribute key="publisher"><value>Springer</value></attribute>
      <attribute key="city"><value>New York</value></attribute>
      <attribute key="year"><value>1980</value></attribute>
    </attribute>
  </relation>
  [...]

</element>