Formerly N-Triples Plus
This document was replaced by an updated version, Turtle - Terse RDF Triple Language, on 2004-01-18.
Turtle, a Terse RDF Triple Language, is an extension of N-Triples ([N-TRIPLES]) with the N most useful and appropriate things added from Notation 3 ([NOTATION3]) while keeping it in the RDF model.
Nearby: the New Syntaxes for RDF paper, which discusses other RDF syntaxes and the background to Turtle (submitted to WWW2004; Turtle is referred to as N-Triples Plus there).
This document is a work in progress; please send feedback to me, Dave.
The N items added beyond N-Triples currently number 8:
@prefix
,
;
[]
a
This EBNF is the notation used in XML 1.0 second edition over an alphabet of [UNICODE] characters.
ntriplesPlusDoc | ::= | statement* |
statement | ::= | directive ws* '.' ws* | triples ws* '.' ws* | comment | ws+ |
directive | ::= | '@prefix' ws+ prefixID ws+ uriref |
triples | ::= | subject ws+ predicateObjectList /* Provides RDF triples using the given subject and each pair from the predicateObjectList */ |
predicateObjectList | ::= | verb ws+ objectList ( ws+ ';' ws* verb ws+ objectList)* /* Provides a sequence of (verb, object) pairs for each object from the objectList */ |
objectList | ::= | object (ws+ ',' ws* object)* /* Provides a sequence of objects */ |
verb | ::= | predicate | 'a' /* where 'a' is equivalent to the qname rdf:type */ |
comment | ::= | '#' ( character - ( #xD | #xA ) )* |
subject | ::= | resource | blank |
predicate | ::= | resource |
object | ::= | resource | blank | literal |
literal | ::= | langString | datatypeString |
langString | ::= | '"' string '"' ( '@' language )? |
datatypeString | ::= | '"' string '"^^' (uriref | qname) |
blank | ::= | nodeID | '[]' | '[' ws* predicateObjectList ws* ']' /* Provides a blank node: either from the given nodeID, a generated one, or a generated one that is also used as the subject of RDF triples for each pair from the predicateObjectList */ |
resource | ::= | uriref | qname |
nodeID | ::= | '_:' name |
qname | ::= | name? ':' name? |
prefixID | ::= | ':' | name ':' |
uriref | ::= | '<' relativeURI '>' |
language | ::= | [a-z]+ ('-' [a-z0-9]+ )* /* encoding a language tag */ |
name | ::= | [A-Za-z][A-Za-z0-9_]* /* see section QNames */ |
relativeURI | ::= | character* /* with escapes as defined in N-Triples section 3.3 URI References; the result is used as a relative URI and resolved against the current base URI to give an absolute URI reference */ |
string | ::= | character* /* with escapes as defined in N-Triples section 3.2 Strings */ |
ws | ::= | #x9 | #xA | #xD | #x20 |
character | ::= | [#x0-#x10FFFF] /* any Unicode character in the range U+0 to U+10FFFF */ |
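The way the triples, predicateObjectList and objectList productions combine can be sketched in a few lines of Python. This is only an illustration of the expansion described in the grammar, not a parser: the function name expand, the already-tokenised input shape and the example terms are assumptions made for this sketch.

```python
from typing import Iterable, Iterator, List, Tuple

# Full URI form of the qname rdf:type, which the verb 'a' abbreviates.
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

Triple = Tuple[str, str, str]


def expand(subject: str,
           predicate_object_list: Iterable[Tuple[str, List[str]]]) -> Iterator[Triple]:
    """Yield one triple per (verb, object) pair, in document order.

    The input is assumed to be tokenised already: a subject term plus a
    list of (verb, objectList) pairs. This shape is hypothetical.
    """
    for verb, objects in predicate_object_list:
        # The verb 'a' is shorthand for rdf:type.
        predicate = RDF_TYPE if verb == "a" else verb
        for obj in objects:
            yield (subject, predicate, obj)


# Corresponds to the Turtle:  <s> a <C> ; <p> <o1> , <o2> .
triples = list(expand("<s>", [("a", ["<C>"]), ("<p>", ["<o1>", "<o2>"])]))
for triple in triples:
    print(" ".join(triple), ".")
```

Each ';' starts a new verb for the same subject and each ',' adds another object for the same verb, so the one statement above yields three triples.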
N-Triples allows Unicode characters outside US-ASCII to be written only with the \uHHHH and \UHHHHHHHH escapes (see N-Triples section 3.2 Strings).
With Turtle, UTF-8 allows them to be written in another form, directly as encoded characters.
Both forms are allowed and do not conflict, since UTF-8 encodes such characters using values outside the base N-Triples US-ASCII range (128 and above).
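As a rough illustration that the two forms denote the same character, the following Python sketch decodes the \uHHHH and \UHHHHHHHH escapes and compares the result with the same text read directly from UTF-8. The helper name unescape_unicode is hypothetical, and the sketch deliberately ignores the other N-Triples string escapes (such as \\ and \n), so it is not a complete unescaper.

```python
import re

# Matches \uHHHH or \UHHHHHHHH with upper-case hexadecimal digits.
_ESCAPE = re.compile(r"\\u([0-9A-F]{4})|\\U([0-9A-F]{8})")


def unescape_unicode(text: str) -> str:
    """Replace \\uHHHH and \\UHHHHHHHH escapes with the characters they name."""
    def replace(match: "re.Match[str]") -> str:
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))
    return _ESCAPE.sub(replace, text)


escaped = "caf\\u00E9"                      # escaped form, pure US-ASCII
direct = b"caf\xc3\xa9".decode("utf-8")     # same text written directly in UTF-8

print(unescape_unicode(escaped) == direct)
```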
The qname definition here is not exactly the same as that of either XML or N3, since the overlap between them is not presently clear.
It currently just adds _ to the N-Triples name definition.
This is a minimal version and should probably import NCName from Namespaces in XML, which uses the Unicode character classes, possibly with some exclusions such as '-' and '.' that N3 uses for other forms. However, that would add a dependency on XML that is not currently present.
A compatible definition could be used instead, such as importing from the XML 1.1 drafts and removing ':' from NameStartChar and '-' and '.' from NameChar, i.e.
name ::= (NameStartChar - ':') (NameChar - '-' - '.')*
(using the terms from the XML 1.1 Proposed Recommendation).
In long form, directly using the Unicode 3.0 character ranges:
name ::= NameStartChar NameChar*
NameStartChar ::= [A-Z] | "_" | [a-z] | [#xC0-#xD6] |
[#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] |
[#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
[#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] |
[#x10000-#xEFFFF]
NameChar ::= NameStartChar | [0-9] | #xB7 | [#x0300-#x036F] |
[#x203F-#x2040]
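The long-form ranges above translate fairly directly into a regular expression. The following Python sketch is one possible transcription, assuming the ranges are exactly as listed; the names NAME_START_CHAR, NAME_CHAR and is_name are made up for this example.

```python
import re

# Character-class bodies transcribed from the NameStartChar and NameChar
# productions (':' is already absent, as are '-' and '.').
NAME_START_CHAR = (
    r"A-Z_a-z"
    r"\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    r"\u0370-\u037D\u037F-\u1FFF"
    r"\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    r"\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD"
    r"\U00010000-\U000EFFFF"
)
NAME_CHAR = NAME_START_CHAR + r"0-9\u00B7\u0300-\u036F\u203F-\u2040"

# name ::= NameStartChar NameChar*
NAME = re.compile("[" + NAME_START_CHAR + "][" + NAME_CHAR + "]*")


def is_name(text: str) -> bool:
    """Return True if the whole of text matches the name production."""
    return NAME.fullmatch(text) is not None


for candidate in ["foaf", "_x1", "\u00e9t\u00e9", "1abc", "a-b", "a.b"]:
    print(repr(candidate), is_name(candidate))
```

A leading digit is rejected because digits appear only in NameChar, and '-' and '.' are rejected anywhere because neither production includes them.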
The MIME type of Turtle is text/turtle (not yet registered). The content encoding of Turtle is always UTF-8, even if this conflicts with the content encoding delivered by a protocol element such as HTTP's Content-Type: header. (Compare N-Triples, which is text/plain and always ASCII.)
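In practice this means a consumer decodes the bytes as UTF-8 whatever charset the transport claims. A minimal Python sketch of that rule follows; the function name decode_turtle and its declared_charset parameter are hypothetical and exist only to make the point that the declared value is not consulted.

```python
from typing import Optional


def decode_turtle(body: bytes, declared_charset: Optional[str] = None) -> str:
    """Decode a Turtle document body as UTF-8.

    declared_charset stands in for whatever a protocol element announced
    (for example a charset parameter); it is deliberately ignored.
    """
    return body.decode("utf-8")


body = '<s> <p> "caf\u00e9" .'.encode("utf-8")

# The transport claims ISO-8859-1, but the document is still read as UTF-8.
text = decode_turtle(body, declared_charset="iso-8859-1")
print(text)
```

Invalid UTF-8 raises UnicodeDecodeError here; whether a real consumer should reject or repair such input is not something this document specifies.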
One possible change, at the cost of more complexity, would be to add an @charset directive that can only appear as the first line of the document, as has been defined for CSS 2.1 in 4.4 CSS document representation and for CSS 2 in 4.4 CSS document representation.
The next likely syntax form from N3 that could be added is the ( ... ) list form, which creates RDF collections from the contained ordered sequence of nodes.
It would mean the following changes:
blank ::= nodeID | '[]' | '[' predicateObjectList ']' | list
list ::= '(' resource* ')'
plus adding a description of the rather complex set of triples that is generated, for example:
( node1 node2 ) is short for
[ rdf:first node1; rdf:rest [ rdf:first node2; rdf:rest rdf:nil ] ]
and ( ) is short for rdf:nil
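The triples generated for a list can be sketched as follows. This is an illustrative Python version of the expansion shown above, not part of the proposed syntax: the function names, the term strings and the _:genidN blank node labels are assumptions of the sketch.

```python
from itertools import count
from typing import Callable, List, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST = f"<{RDF}first>"
REST = f"<{RDF}rest>"
NIL = f"<{RDF}nil>"

Triple = Tuple[str, str, str]


def blank_node_factory() -> Callable[[], str]:
    """Return a function producing fresh labels (hypothetical _:genidN scheme)."""
    counter = count(1)
    return lambda: f"_:genid{next(counter)}"


def expand_list(nodes: List[str], new_blank: Callable[[], str]) -> Tuple[str, List[Triple]]:
    """Return (head, triples) for the list form ( node1 node2 ... ).

    The head is the term that stands in for the whole list: rdf:nil for
    the empty list, otherwise the first generated blank node.
    """
    if not nodes:
        return NIL, []
    cells = [new_blank() for _ in nodes]
    triples: List[Triple] = []
    for index, (cell, node) in enumerate(zip(cells, nodes)):
        # The last cell points at rdf:nil, every other cell at its successor.
        rest = cells[index + 1] if index + 1 < len(cells) else NIL
        triples.append((cell, FIRST, node))
        triples.append((cell, REST, rest))
    return cells[0], triples


head, triples = expand_list(["<node1>", "<node2>"], blank_node_factory())
for triple in triples:
    print(" ".join(triple), ".")
```

A list of n nodes therefore costs 2n triples plus n generated blank nodes, which is the complexity that a description in the specification would have to spell out.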
Other possible extensions would be to add @base uri to set the base URI in the same fashion as xml:base, and @language to set the default literal language for the following terms in the document.
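To show what an @base directive would change, here is a small Python sketch of resolving relative URIs against a base URI using the standard library's urljoin. The resolve wrapper and the example.org URIs are illustrative assumptions; @base itself is only a possible extension.

```python
from urllib.parse import urljoin


def resolve(base_uri: str, relative_uri: str) -> str:
    """Resolve the content of a <...> uriref against the current base URI."""
    return urljoin(base_uri, relative_uri)


# Hypothetical base, as if set by:  @base <http://example.org/data/doc.ttl>
base = "http://example.org/data/doc.ttl"

for relative in ["#me", "other", "../up", "http://other.example/x"]:
    print(relative, "->", resolve(base, relative))
```

An absolute URI is returned unchanged, so existing documents that use only absolute urirefs would be unaffected by such a directive.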
This work was done under the Semantic Web Advanced Development Europe (SWAD-Europe) project, funded by the EU IST-7 programme IST-2001-34732, with further development supported by the Institute for Learning and Research Technology at the University of Bristol, UK (2002-2004).
Copyright 2003-2004 Dave Beckett
Last Revised: $Date: 2004/01/07 22:43:43 $