about semantic web, software architecture and life in general

Archives for: February 2011, 21

2011-02-21

01:31:47, Categories: Semantic Web, Software Development

NQuad parsing using Jython

To parse NQuad RDF files (e.g., the Billion Triples Challenge data files), Java folks can use the NxParser by Aidan Hogan and Andreas Harth: NxParser - Parser for NTriples, NQuads, and more.

You can also use it from Python, provided you use Jython, the Python implementation that runs on the JVM and can import Java classes directly.

Code:

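import sys
# Make the NxParser classes importable; adjust the path to where the jar lives.
sys.path.append("./nxparser.jar")

from org.semanticweb.yars.nx.parser import NxParser
from java.io import FileInputStream
from java.util.zip import GZIPInputStream

def all_triples(fname, use_gzip=False):
    """Yield each record of the file as a list of N3 strings."""
    in_file = FileInputStream(fname)
    if use_gzip:
        # Decompress on the fly for .gz input.
        in_file = GZIPInputStream(in_file)

    try:
        nxp = NxParser(in_file, False)
        while nxp.hasNext():
            record = nxp.next()
            # Convert every node of the record to its N3 representation.
            yield [node.toN3() for node in record]
    finally:
        # Release the file handle once the stream is exhausted or closed.
        in_file.close()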

The code above defines a generator function that yields a stream of NQuad records, each as a list of N3 strings. We can now add some demo code to see it in action:
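As a rough illustration of how such a stream of records might be consumed (a sketch of my own, not the demo from the full post), the snippet below counts records per context while reading lazily. The names `sample_records` and `count_by_context` are hypothetical, and `sample_records` is only a stand-in that yields hand-written N3 strings so the snippet runs without the jar. It assumes every record has four elements (subject, predicate, object, context), which would not hold for plain NTriples input.

```python
from itertools import islice

def sample_records():
    # Hypothetical stand-in for all_triples(...): each record is a list of
    # four N3 strings (subject, predicate, object, context).
    yield ['<http://example.org/a>', '<http://example.org/p>', '"1"',
           '<http://example.org/g1>']
    yield ['<http://example.org/b>', '<http://example.org/p>', '"2"',
           '<http://example.org/g1>']
    yield ['<http://example.org/c>', '<http://example.org/p>', '"3"',
           '<http://example.org/g2>']

def count_by_context(records, limit=None):
    """Count records per context, reading at most `limit` records."""
    counts = {}
    # islice stops after `limit` records; None means read the whole stream.
    for record in islice(records, limit):
        context = record[3]  # assumes four-element (quad) records
        counts[context] = counts.get(context, 0) + 1
    return counts

counts = count_by_context(sample_records())
for context in sorted(counts):
    print("%s %d" % (context, counts[context]))
```

Under Jython, the same function could presumably be fed from the parser instead, e.g. `count_by_context(all_triples("data.nq.gz", use_gzip=True), limit=1000)`, where the file name is made up. Because the records come from a generator, a `limit` keeps memory use and run time small even on very large dumps.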

Read more! »
