Turtle - Terse RDF Triple Language

Formerly N-Triples Plus

Dave Beckett

STATUS

This document has been replaced by an updated version, Turtle - Terse RDF Triple Language, dated 2004-01-18.

Introduction

Turtle, a Terse RDF Triple Language, is an extension of N-Triples ([N-TRIPLES]) with the N most useful and appropriate things added from Notation 3 ([NOTATION3]) while keeping it within the RDF model.

Nearby: the New Syntaxes for RDF paper, which discusses other RDF syntaxes and the background to Turtle (submitted to WWW2004; Turtle is referred to as N-Triples Plus there).

This document is a work in progress; please send feedback to me.
Dave

Turtle Grammar

The N items added beyond N-Triples currently number 8:

  1. Whitespace restrictions removed
  2. Text content-encoding changed from ASCII to UTF-8
  3. @prefix
  4. QNames
  5. ,
  6. ;
  7. []
  8. a

This EBNF is the notation used in XML 1.0 second edition over an alphabet of [UNICODE] characters.

Turtle - Terse RDF Triple Language EBNF
ntriplesPlusDoc ::= statement*
statement ::= directive ws* '.' ws* | triples ws* '.' ws* | comment | ws+
directive ::= '@prefix' ws+ prefixID ws+ uriref
triples ::= subject ws+ predicateObjectList
Provides RDF triples using the given subject and each pair from the predicateObjectList
predicateObjectList ::= verb ws+ objectList ( ws+ ';' ws* verb ws+ objectList)*
Provides a sequence of (verb, object) pairs for each object from the objectList
objectList ::= object (ws+ ',' ws* object)*
Provides a sequence of objects
verb ::= predicate | 'a'
where 'a' is equivalent to qname rdf:type
comment ::= '#' ( character - ( #xD | #xA ) )*
subject ::= resource | blank
predicate ::= resource
object ::= resource | blank | literal
literal ::= langString | datatypeString
langString ::= '"' string '"' ( '@' language )?
datatypeString ::= '"' string '"^^' (uriref | qname)
blank ::= nodeID | '[]' | '[' ws* predicateObjectList ws* ']'
Provides a blank node either from the given nodeID, a generated one or a generated one which is also used to provide the subject of RDF triples for each pair from the predicateObjectList
resource ::= uriref | qname
nodeID ::= '_:' name
qname ::= name? ':' name?
prefixID ::= ':' | name ':'
uriref ::= '<' relativeURI '>'
language ::= [a-z]+ ('-' [a-z0-9]+ )*
Encodes a language tag
name ::= [A-Za-z][A-Za-z0-9_]*
See section QNames
relativeURI ::= character* with escapes as defined in the N-Triples section 3.3 URI References. This is then used as a relative URI and resolved against the current base URI to give an absolute URI reference.
string ::= character* with escapes as defined in N-Triples section 3.2 Strings
ws ::= #x9 | #xA | #xD | #x20
character ::= any Unicode character in the range U+0 to U+10FFFF
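
As an illustration of how the ';' and ',' abbreviations and the 'a' verb expand into plain triples, here is a minimal Python sketch. It assumes the input has already been parsed, so every term is an opaque string; the function name expand_triples and the shape of its input are hypothetical and not part of this grammar.

```python
# Sketch only: terms are opaque strings, no tokenising or qname resolution.

def expand_triples(subject, predicate_object_list):
    """Expand one 'triples' production into a flat list of triples.

    predicate_object_list is a list of (verb, objects) pairs: one pair
    per ';'-separated group, with the ','-separated objects as a list.
    """
    triples = []
    for verb, objects in predicate_object_list:
        # 'a' is equivalent to the qname rdf:type
        predicate = "rdf:type" if verb == "a" else verb
        for obj in objects:
            triples.append((subject, predicate, obj))
    return triples


# Corresponds to:  _:a a foaf:Person ; foaf:knows _:b , _:c .
example = expand_triples(
    "_:a",
    [("a", ["foaf:Person"]), ("foaf:knows", ["_:b", "_:c"])],
)
```

Each (verb, object) pair yields exactly one triple with the shared subject, so the example above stands for three triples.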

Unicode strings

N-Triples only allows writing Unicode characters using \uHHHH and \UHHHHHHHH (see N-Triples section 3.2 Strings). With Turtle, UTF-8 allows them to be written in another form. Both are allowed and do not conflict, since UTF-8 encodes such characters using bytes outside the base N-Triples US-ASCII range (128+).
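
The two spellings can be compared with a short Python sketch. It is not a full implementation of the N-Triples string escapes: it handles only the two Unicode escape forms and ignores the others (including an escaped backslash in front of a 'u'). The helper name decode_unicode_escapes is hypothetical.

```python
import re

# Matches a backslash followed by uHHHH or UHHHHHHHH (uppercase hex digits).
_UNICODE_ESCAPE = re.compile(r"\\u([0-9A-F]{4})|\\U([0-9A-F]{8})")


def decode_unicode_escapes(text):
    """Replace the 4- and 8-digit Unicode escapes with the characters they name."""
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))
    return _UNICODE_ESCAPE.sub(replace, text)


escaped_form = "caf\\u00E9"                    # ASCII-only spelling
utf8_form = b"caf\xc3\xa9".decode("utf-8")     # same string written as UTF-8 bytes
same_string = decode_unicode_escapes(escaped_form) == utf8_form
```

Both spellings denote the same string, which is why a Turtle document may mix them freely.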

QNames

The qname definition here is not exactly the same as that of either XML or N3, since the overlap between them is not presently clear. For now it just adds '_' to the N-Triples name definition.

This is a minimal version and probably should import NCName from Namespaces in XML, which uses the Unicode character classes, possibly with some exclusions such as '-' and '.' that N3 uses for other forms. However, that would add a dependency on XML that is not currently present.

A compatible definition could be used, such as importing from the XML 1.1 drafts and removing the ':' from NameStartChar and the '-' and '.' from NameChar,

i.e. name ::= (NameStartChar - ':') (NameChar - '-' - '.')*

(Using the terms from the XML 1.1 Proposed Recommendation)

In long form, directly using the Unicode 3.0 character ranges:

name ::= NameStartChar NameChar*

NameStartChar ::= [A-Z] | "_" | [a-z] | [#xC0-#xD6] |
[#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] |
[#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
[#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] |
[#x10000-#xEFFFF]

NameChar ::= NameStartChar | [0-9] | #xB7 | [#x0300-#x036F] |
[#x203F-#x2040]
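
As a rough check of the long form, the ranges above can be transcribed into a Python regular expression. This is a sketch under the assumption that Python's re module is an acceptable stand-in for the EBNF character classes; NAME_RE and is_name are hypothetical names.

```python
import re

# NameStartChar ranges, copied from the production above.
NAME_START = (
    "A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# NameChar adds digits, the middle dot and two further ranges.
NAME_CHAR = NAME_START + "0-9\u00B7\u0300-\u036F\u203F-\u2040"

NAME_RE = re.compile(f"[{NAME_START}][{NAME_CHAR}]*")


def is_name(text):
    """Return True if the whole string matches name ::= NameStartChar NameChar*."""
    return NAME_RE.fullmatch(text) is not None
```

Note that, unlike the minimal name production in the grammar, the long form also accepts a leading '_'.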

MIME Type and Content Encoding

The MIME type of Turtle is text/turtle (not yet registered), and the content encoding of Turtle is always UTF-8, even if this conflicts with the content encoding delivered by a protocol element such as HTTP's Content-Type: header. (Compare to N-Triples, which is text/plain and always ASCII.)
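
In code terms, the rule amounts to ignoring any charset announced by the protocol. A minimal Python sketch follows; the function name decode_turtle and its content_type parameter are hypothetical and exist only to show that the header value plays no part in decoding.

```python
def decode_turtle(body: bytes, content_type: str = "") -> str:
    """Decode a Turtle document body.

    content_type is accepted but deliberately unused: whatever charset
    the protocol announces, the bytes are read as UTF-8.
    """
    return body.decode("utf-8")


# The header claims ISO-8859-1, but the body is still decoded as UTF-8.
text = decode_turtle(
    '<a> <b> "caf\u00e9" .'.encode("utf-8"),
    content_type="text/turtle; charset=iso-8859-1",
)
```

A body that is not valid UTF-8 raises UnicodeDecodeError here; how a real parser should report that is left open.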

One possible change, at the cost of more complexity, would be to add an @charset directive that can only appear as the first line of the document, as has been defined for CSS 2.1 and for CSS 2 (section 4.4, CSS document representation, in both).

Possible Extensions

The next likely syntax form from N3 that could be added is the ( ... ) list form that creates RDF collections from the contained ordered sequence of nodes. It would mean the following changes:

blank ::= nodeID | '[]' | '[' predicateObjectList ']' | list

list ::= '(' resource* ')'

plus a description of the rather complex set of triples that are generated, for example:

( node1 node2 ) is short for
[ rdf:first node1; rdf:rest [ rdf:first node2; rdf:rest rdf:nil ] ]
and ( ) is short for
rdf:nil
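
The expansion shown above can be sketched in Python as follows. This is only an illustration of the rdf:first / rdf:rest chain: terms are opaque strings, and the function name expand_list and the _:listN blank node labels are hypothetical.

```python
from itertools import count


def expand_list(nodes, fresh=None):
    """Expand ( node1 node2 ... ) into (head, triples).

    head is the term that stands for the whole list: rdf:nil for the
    empty list, otherwise the generated blank node of the first cell.
    """
    if fresh is None:
        fresh = count(1)          # source of blank node numbers
    if not nodes:
        return "rdf:nil", []
    cells = [f"_:list{next(fresh)}" for _ in nodes]
    triples = []
    for index, (cell, node) in enumerate(zip(cells, nodes)):
        # The last cell points at rdf:nil, every other cell at its successor.
        rest = cells[index + 1] if index + 1 < len(cells) else "rdf:nil"
        triples.append((cell, "rdf:first", node))
        triples.append((cell, "rdf:rest", rest))
    return cells[0], triples


head, triples = expand_list(["node1", "node2"])
```

Each list member costs two triples, which is what makes the generated set rather complex to describe.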

Other possible extensions would be to add an @base uri to set the base URI in the same fashion as xml:base, and @language to set the default literal language for the following terms in the document.
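
For the base URI, the effect would be ordinary relative URI resolution, as already described for relativeURI in the grammar. A minimal Python sketch using the standard library's urllib.parse.urljoin is given below; the example.org URIs are made up for illustration, and the base stands for whatever an @base directive (or the document's retrieval URI) would supply.

```python
from urllib.parse import urljoin

# Hypothetical base URI for the document being parsed.
base_uri = "http://example.org/data/doc.ttl"

resolved = {
    relative: urljoin(base_uri, relative)
    for relative in ("other#frag", "#me", "/abs")
}
# resolved["other#frag"] == "http://example.org/data/other#frag"
# resolved["#me"]        == "http://example.org/data/doc.ttl#me"
# resolved["/abs"]       == "http://example.org/abs"
```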

References

[N-TRIPLES]
N-Triples section in RDF Test Cases, J. Grant and D. Beckett, Editors, World Wide Web Consortium Proposed Recommendation, work in progress, 15 December 2003. This version of the RDF Test Cases is http://www.w3.org/TR/2003/PR-rdf-testcases-20031215/. The latest version of the RDF Test Cases is at http://www.w3.org/TR/rdf-testcases/.
[NOTATION3]
Notation 3, Tim Berners-Lee, World Wide Web Consortium
[UNICODE]
The Unicode Standard Version 3.0, Addison Wesley, Reading MA, 2000, ISBN: 0-201-61633-5. This document is http://www.unicode.org/unicode/standard/standard.html.

Acknowledgements

This work was done under the Semantic Web Advanced Development Europe (SWAD-Europe) project funded by the EU IST-7 programme IST-2001-34732, with further development supported by the Institute for Learning and Research Technology at the University of Bristol, UK (2002-2004).


Copyright 2003-2004 Dave Beckett

Last Revised: $Date: 2004/01/07 22:43:43 $