Archive for the 'XSLT' Category

GRDDL, Microformats

Posted in GRDDL, Microformats, RDF, Semantic Web, Technology, XSLT on August 15th, 2005 at 12:03:16

At some point, on the FOAFnet mailing list, David Sifry asked, “What is DOAP? What is GRDDL? What’s the use case?” in response to a post by Danny Ayers.

My reply was simple and succinct, in my opinion, but I never put it anywhere public. I think it’s a reasonably short explanation of why GRDDL can be useful for moving between microformats, with their tiny datasets, and RDF, with the large dataset behind it.

What is DOAP? What is GRDDL? What’s the use case?

DOAP - Description of a Project. DOAP is to Open Source software what FOAF is to people. http://usefulinc.com/doap

GRDDL - Gleaning Resource Descriptions from Dialects of Languages. This is a way of performing transformations from (X)HTML documents to RDF documents. The transformation is an XSLT stylesheet, specified either implicitly through the “profile” attribute of the <head> of the HTML document, or explicitly via a link rel=”transformation”, which parsers can then apply to the document to get something with real meaning out of it.

http://www.w3.org/TeamSubmission/grddl/
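
As a rough illustration of the discovery step described above, the sketch below checks an XHTML document for the GRDDL profile on <head> and collects any link elements marked rel="transformation". It uses only Python's standard library. The sample page, the stylesheet URL and the function name find_grddl_transformations are made up for illustration, and the profile URI is the one I understand the GRDDL submission to use. A real processor would go on to fetch and apply the stylesheets.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Profile URI as I understand it from the GRDDL submission (assumption).
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

# Hypothetical sample page; the stylesheet URL is a placeholder.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Example</title>
    <link rel="transformation" href="http://example.org/to-rdf.xsl" />
    <link rel="stylesheet" href="style.css" />
  </head>
  <body><p>Hello</p></body>
</html>"""


def find_grddl_transformations(xhtml_text):
    """Return the hrefs of the transformations a page declares, if any."""
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    # Only trust rel="transformation" when the page opts in via the profile.
    profiles = head.get("profile", "").split()
    if GRDDL_PROFILE not in profiles:
        return []
    hrefs = []
    for link in head.findall(f"{{{XHTML_NS}}}link"):
        # rel is a space-separated list of link types.
        if "transformation" in link.get("rel", "").split() and link.get("href"):
            hrefs.append(link.get("href"))
    return hrefs


print(find_grddl_transformations(SAMPLE))
```

Each returned URL would then be handed to an XSLT processor along with the page itself.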

Use Case:

“Microformats” and storing data in HTML may be good for people who are writing specialized tools for each format they want to deal with, but GRDDL allows for people who want to support all these to simply write one translation, using XSLT, and take advantage of the underlying data model of RDF. This allows you to merge existing data sources with the newly emergent small-s semantic technologies, and to use existing tools for combining these sets of data and discovering correlations between them, using the already existing RDF data access framework.

If I have a datastore of RDF data (which I do), and someone wants to find if the maintainer of a certain project has contact information in that database, using hBlah pages, I can’t really do that. However, if I merge the datasets with the already existing FOAF data out there, I can find out that a person named Christopher Schmidt, who is a maintainer of “julie, aka redlandbot”, also has an email address of crschmidt@crschmidt.net.

That’s a pretty simplified use case, but the general idea is simply to take the tiny sources of data that the h* formats provide, and integrate them with the millions of pieces of data out there already in FOAF, RSS, and everything else Semantic.
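
To make the merge concrete, here is a minimal sketch that treats each dataset as a plain set of triples and answers the "maintainer's email" question from the union. It does not use a real RDF library. The identifiers (project:julie, person:crschmidt) and the helper names are hypothetical stand-ins, and it assumes both sources already use the same identifier for the person, which real data often would not.

```python
# Triples are (subject, predicate, object) tuples; all identifiers below
# are hypothetical stand-ins, not real DOAP or FOAF data.
DOAP_TRIPLES = {
    ("project:julie", "doap:name", "julie, aka redlandbot"),
    ("project:julie", "doap:maintainer", "person:crschmidt"),
}
FOAF_TRIPLES = {
    ("person:crschmidt", "foaf:name", "Christopher Schmidt"),
    ("person:crschmidt", "foaf:mbox", "mailto:crschmidt@crschmidt.net"),
}


def objects(graph, subject, predicate):
    """All objects for a given subject and predicate, sorted for stable output."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def maintainer_mailboxes(graph, project_name):
    """Follow name -> project -> maintainer -> mbox across the graph."""
    found = []
    for s, p, o in graph:
        if p == "doap:name" and o == project_name:
            for person in objects(graph, s, "doap:maintainer"):
                found.extend(objects(graph, person, "foaf:mbox"))
    return sorted(found)


# Merging two RDF graphs is, at its simplest, a set union of their triples.
merged = DOAP_TRIPLES | FOAF_TRIPLES
print(maintainer_mailboxes(merged, "julie, aka redlandbot"))
```

Neither dataset can answer the question on its own; only the merged graph connects the project to the mailbox.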

Building WebKit on Panther

Posted in Software, WebKit, XSLT on June 9th, 2005 at 00:26:48

I mentioned the other day the release of Apple’s WebKit, WebCore, and JavaScriptCore (the latter two of which were already publicly available). Naturally, the first thing I wanted to do was download it and give it a try. This post outlines the steps I took to get as far as possible in the build process at this point. First, I would like to mention that this project has the cleanest build steps I have ever seen. It is well documented all the way through, and for Tiger users on Xcode 2.0, the build went off without a hitch the first time through. (Xcode 2.1 problems have since been fixed.) The members of the supporting IRC channel are helpful and intelligent, and the mailing list has already taken multiple patches from non-employees into the source tree. This is, quite simply, the best opening for an open source project that I have ever been aware of.

However, the build process currently favors those with Tiger, and the current CVS does not support those who are using Panther. Apple developers have expressed an interest in correcting this once WWDC, being held this week in San Francisco, is over. So, I took it upon myself to report bugs in Bugzilla where applicable, to help the developers out once they get a chance to breathe.

First Problem: Building returned an error: “CarbonSound.h not available”. This was a result of my not yet having installed the QuickTime 7 SDK. (It has been in Software Update, I just hadn’t touched it yet.) Updating fixed that.

Second Problem: 10.3.9 Build Failure: NSString may not respond to `+stringWithCString:encoding:'. This is a method which was not available in Panther. Maciej has said he is working on a patch to have this use CFString instead, where it is available. (I am tossing about some terms I don’t know here, so please excuse any incorrect terminology.) Workaround for the time being: copy the last two build commands before the failure (a cd line and a gcc-3.3 line) and paste them back into the terminal, altering the gcc-3.3 line slightly to remove the -Werror flag. This may cause problems later on, but the code will compile for the time being.
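
Editing a very long compiler line by hand is error-prone, so here is a small sketch of the -Werror removal as a helper. The function name without_werror and the example command are hypothetical; a real Xcode-generated gcc-3.3 line is far longer. It only removes the bare -Werror flag and leaves everything else untouched.

```python
import shlex


def without_werror(command_line):
    """Return the compiler command with any bare -Werror flag removed."""
    args = shlex.split(command_line)
    kept = [arg for arg in args if arg != "-Werror"]
    return shlex.join(kept)


# Hypothetical compile line, much shorter than a real Xcode one.
example = "gcc-3.3 -Wall -Werror -c Example.m -o Example.o"
print(without_werror(example))
```

The result can be pasted after the matching cd line, exactly as in the manual workaround.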

Third Problem: isnan failure in kjs_window.cpp: This one puzzles me a bit, especially since (as I mention in the bug) the code seems to show explicit knowledge of the problem. However, a workaround is now offered in comment 1 of the bug: replace using std::isnan; with extern "C" int isnan(double); This fixed the problem for me.

Fourth Problem: XSLT Headers not installed - This one is more symptomatic of the way that Apple releases updates, and is something that dajobe has brought up with building Redland in the past: “Headers don’t match libraries”. This is true here as well, but I now (thanks to toby from #webkit) know the reason: Apple does *not* ship updated headers with libraries updated through Software Update. Since libxslt is new in 10.3.9, there are no development headers. Dave Hyatt, of the WebKit team, mentioned that the whole team, when building on Panther, had to install libxslt and libxml from source. Once I did this, the problem went away.

Fifth Problem: libxml headers are wrong - I hit this before I installed libxml from source, which fixed it. It is, again, related to the fact that Apple does not update headers through Software Update.

Once you get through these, you will have built both JavaScriptCore and WebCore. Congratulations! You now have two completely useless frameworks which the new WebKit will depend on when you can build it! :)

WebKit is the previously unreleased Apple-specific framework which is the “pretty” part of WebCore - it’s what ties everything together. It has a few more issues building on Panther, but most of them can be worked around by simply copying and pasting build lines without the -Werror flag. (Note that this may produce unstable results! These builds are not designed for production, and I do not advise doing this and then filing bug reports about Safari crashing.)

npapi headers not available - for some reason, building on Panther does not find the appropriate headers from the in-process WebKit build. I really have no clue why this is, and neither did anyone else when I was building. My workaround was to copy the headers out of the framework and into ~/build/include (a directory I had to make), which was already on the include path, and then continue the build:
cp ~/build/WebKit.Framework/Versions/A/Headers/* ~/build/include
cp ~/build/WebKit.Framework/Versions/A/PrivateHeaders/* ~/build/include
I am not sure why this is necessary, but it does seem to work.

Missing 10.4 Method -setCompositingOperation for WebImageRenderer - Two parts of the code require the methods (void)setCompositingOperation:(NSCompositingOperation)operation; and (NSCompositingOperation)compositingOperation; which were added in 10.4. This can be resolved by following the -Werror removal steps above. You will have to do this several times.

Missing 10.4 Method CFMakeCollectible - CFMakeCollectible is new in 10.4. Building without -Werror allows the build to continue.

And, the current showstopper: Missing SecurityNssAsn1 headers. This comes from the libWebKitSystemInterface.a file, which is currently Tiger-specific. Once WWDC is over, a Panther binary file will be released. Until then, this is where the ride stops: you can build WebCore and JavaScriptCore, but WebKit is out of your reach until you get your hands on Tiger.

Luckily for me, I’m going to be in Cupertino this weekend, so I’ll pick up a copy and get it installed soon ;)

Library in RDF

Posted in Delicious Library, RDF, Semantic Web, XSLT on June 5th, 2005 at 21:19:20

A long time ago, when I first got a Mac, there was a lot of hubbub about a program called “Delicious Library”: an application that would let you scan in your books, and provided an awesome user interface to searching, storing, lending, and everything else you might want to do with them. At the time, I wanted it, and I wanted it bad, but I decided to wait until I got an iSight: the idea of entering hundreds, perhaps up to a thousand, UPCs by hand, did not strike me as one of my favored tasks.

March 19th, I got an iSight: a birthday present, from Jess. I thought then “ooh, Delicious Library”, but never got around to it.

This weekend, I was starting to pack up books from the bookshelves. I thought “Hey, I won’t have a clue where any of the books are… unless…”

Jess was out of the house. I downloaded and tried the program: I scanned a full shelf of books (after getting some decent light) and packed them up before I hit my 25 limit and had to pay the piper. $40 for knowing where all of these books are after we move (as well as a new toy to play with) is well worth it.

I scanned another shelf (and ran out of boxes), then sat down to do the fun part.

First: xml2rdf - an XSLT stylesheet to convert from Delicious Library’s XML format to RDF. One of the biggest problems with this stylesheet is that it needs to know which image files Delicious Library actually has on disk: this is where files.xml comes in, which is constructed using the following bash commands:

# Build files.xml: one <image> element per cover image on disk.
echo "<container>" > files.xml
for i in ~/Library/Application\ Support/Delicious\ Library/Images/Medium\ Covers/*; do
    # Strip the directory part, leaving only the file name.
    j=$(basename "$i")
    echo "<image size='medium' name='$j' />" >> files.xml
done
echo "</container>" >> files.xml

This is then loaded with XSLT’s document() function to find out which files are available, so that inaccurate <foaf:depiction>s are not spat into the output: Amazon does not store cover images for some books, so until I implemented this fix, there were broken image references.
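
For comparison, the same files.xml listing can be sketched in Python, which has the side benefit of escaping awkward file names (an ampersand or quote in a name would break the echo-based version). This is an alternative sketch, not the script I actually used; the function name build_files_xml and the demo file names are made up for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def build_files_xml(image_dir, size="medium"):
    """Return a <container> element listing every file in image_dir."""
    container = ET.Element("container")
    for name in sorted(os.listdir(image_dir)):
        # Attribute values are escaped on serialization, so odd file
        # names cannot break the XML the way raw echo output could.
        ET.SubElement(container, "image", {"size": size, "name": name})
    return container


# Demonstrate on a throwaway directory with made-up cover file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for fake in ("0123456789.jpg", "tom & jerry.jpg"):
        open(os.path.join(demo_dir, fake), "w").close()
    xml_text = ET.tostring(build_files_xml(demo_dir), encoding="unicode")

print(xml_text)
```

Pointing build_files_xml at the Medium Covers directory and writing the result to files.xml would give the stylesheet the same input as before.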

Next: convert.py - Load the file as an RDF model, delete all the existing dc:description statements, and replace them with versions converted using rtfreader from Brandon’s Program Archive.

Next: Process through cwm for RDF pretty printing.

Next: rdf2html - taking the RDF output and converting it to HTML.

End result? Content negotiated version of the books I’ve scanned so far in the Books Library - RDF and HTML versions available.

This is my first major experience with XSLT, and I’ve found it to be pretty darn easy: far less difficult than I thought it would be. I think that I may go on an XSLT kick for the next couple of weeks, so don’t be surprised if you see a lot more of my RDF looking a little bit prettier. For example, I already wrote an XSLT stylesheet for the FIF reviews I’ve received, so if you’re using a capable browser, those will be a lot nicer looking now than they used to be.

Google Sitemap Format

Posted in RDF, Semantic Web, XSLT on June 3rd, 2005 at 10:02:23

Josh points out Google’s Sitemap Protocol, via the SWIG Chump. I pull out my XSLT-foo (what little of it there is). I hack a bit back and forth and run into a problem which uche helps me figure out: “XPath does *not* use the default prefix in the stylesheet for purposes of matching”. I fix my XSLT up a bit, and create a new RDF source under my semweb section: Google Sitemap Tools, including an XSLT stylesheet, example output, and a conversion service which uses the XSLT. For example, Google’s Example File in RDF.
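
The namespace problem is easy to reproduce outside XSLT. The sketch below uses Python's ElementTree, whose path matching follows the same rule: an unprefixed name in the path means "no namespace", regardless of any default namespace declared in the document. The namespace URI is the one I believe the 2005 sitemap format used, so treat it as an assumption; the sample URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the 2005 sitemap format; adjust if yours differs.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"

SAMPLE = f"""<urlset xmlns="{SITEMAP_NS}">
  <url><loc>http://example.org/</loc></url>
  <url><loc>http://example.org/about</loc></url>
</urlset>"""

root = ET.fromstring(SAMPLE)

# An unprefixed name means "no namespace", so this finds nothing even
# though the document declares a default namespace.
unprefixed = root.findall("url/loc")

# Binding an explicit prefix to the same URI makes the match succeed.
ns = {"sm": SITEMAP_NS}
prefixed = [loc.text for loc in root.findall("sm:url/sm:loc", ns)]

print(len(unprefixed), prefixed)
```

In a stylesheet the equivalent fix is to declare a prefix for the sitemap namespace on xsl:stylesheet and use it in every match and select expression.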

Now, to find some sitemaps in action in the real world, and add gzip decoding of gzipped sitemaps.
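
The gzip step could look something like the sketch below, which sniffs the two-byte gzip magic number rather than trusting a file extension or HTTP header. The function name read_sitemap_bytes is hypothetical, and this is not yet part of the conversion service.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def read_sitemap_bytes(raw):
    """Return sitemap XML bytes, decompressing first if the input is gzipped."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


# Placeholder sitemap content for the round-trip check.
plain = b"<urlset><url><loc>http://example.org/</loc></url></urlset>"
print(read_sitemap_bytes(gzip.compress(plain)) == plain)
```

The returned bytes can be passed straight to the XSLT step whether or not the sitemap was compressed.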

XSLT + Image Regions + Sparql

Posted in Flickr, Image Description, RDF, SPARQL, XSLT on May 22nd, 2005 at 20:05:23

Read Masahide’s notes on XSLT+Image Regions. Used some tools to convert my flickr photos to RDF.

Converted an XSLT Stylesheet to a different result format. Loaded ~400 RDF files into a Model, totalling 33,000 statements. Added an option to my Sparql Interface. Changed the default query. Made the extra option add the stylesheet.

Ran a query. Tweaked until it worked. Typed it all up here, to share with all of you.

Hooray for masahide, flickr, and all kinds of other wonderful things.