Exhaustion with RDF

Posted in Semantic Web on June 21st, 2005 at 23:36:27

This is probably a familiar story to many of you who have been around a while, but I’ve lost a lot of my interest in working with the Semantic Web lately, and I don’t see it coming back anytime in the near future. For those of you who are waiting on action items from me, I recommend removing them from my plate and putting them somewhere they are more likely to be taken care of.

There are a few reasons for this. One is simply a lack of time: I’ve been working 14-hour days at work for the past two weeks, and that’s probably not going to change in the near future. Combine that with the fact that I need to start job searching as well, since we’ll be moving to Cambridge soon, and you’ve got an extreme amount of time going out the door to projects that aren’t my own.

Another is frustration with the evangelizing that comes with working in the Semantic Web world. Every time I take a step forward with some code, I find another five steps I have to take back in order to defend my position and the way I’ve done it. After doing this repeatedly for several months now, I’m growing tired of spending more than half of my time fighting to defend the way I’ve built a project, rather than soliciting patches or getting help from the community.

Another is the lack of widespread support from the powers that could help move the RDF and Semantic Web movement forward. It would be relatively simple for something like IMDb to open up its database in an RDF format. This would allow a widespread rating system to be built around the data that IMDb provides, offering a way to distribute information about movies that could be useful in any number of ways. Similarly for Netflix. Similarly for a half dozen other sites out there – but it never happens. Instead, they stick to their proprietary information, keeping everything internal. While this may generate more income for them, it hardly represents any interest in interacting with the community, which is what the Semantic Web needs in order to accelerate adoption.

I’ve had relatively little feedback on the projects I have put together. Things like rdfgpg, redlandbot, etc. all get left in the dust of the work of larger groups of people, with more impressive results (and rightly so). Nothing I’m doing is particularly innovative or interesting, and it shows in the response from the community.

There is much more motivation among the people behind things like microformats – something that’s close to RDF, but far enough away (and unlikely ever to be transformed into it) that it seems pointless. People are building all these tools that take advantage of the small-s semantic web, but not taking the one extra step needed – via GRDDL, profiles, whatever. They think they’re writing the new version of the SemWeb, when in reality they’re just creating an incomplete imitation.

I suppose at some point, people will start to come around. The world of RDF is powerful. The world of HTML is not. Trying to create semantics out of a language that has none will not work in the long run. For right now, however, people are convinced it will, and that leaves most of the work I’ve done behind as people hop onto the next bandwagon.

I’m going to try to clean up some of the code I have, document it fully, and attach licenses to it, so that people who want to use it or maintain it can take it up. This is especially true for Julie, which is kind of my pride and joy as far as code goes.

My interests typically move in roughly six-month cycles, so I may eventually swing back toward Semantic Web development. For now, however, I’m going to do my best to wrap things up and move on to something different, where I don’t have to fight every step of the way to get the things I do acknowledged.

Microsoft Blocking Access via User-Agent

Posted in Social on June 9th, 2005 at 13:14:34

Earlier today, someone gave me a link to a file on Microsoft.com. Since the WebKit release, I’ve been using Safari much more (even though it’s not any faster – I haven’t yet built the new WebKit), because I was reminded that it is much less of a memory hog than Firefox has been for me lately.

I tried to open the link, a download page for some driver, and received an error: “Sorry, we are unable to show you the page you requested. Please try again later.”

I tried in Firefox: worked fine.

Turns out that a User-Agent with either “Safari/312” or “AppleWebKit” in it is enough for Microsoft to decline to share these files with you. Not only that, but it seems to apply to any file in their download area.
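If you want to check this yourself, curl makes it easy to compare responses under different User-Agent strings. (A sketch only: the URL below is a placeholder, not the actual download page, and the UA strings are just representative examples.)

# Firefox-style User-Agent – the page comes back normally
curl -I -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4" http://www.microsoft.com/downloads/example-driver.aspx

# WebKit-style User-Agent – expect the "unable to show you the page" error
curl -I -A "Mozilla/5.0 (Macintosh; U; PPC Mac OS X) AppleWebKit/312 (KHTML, like Gecko) Safari/312" http://www.microsoft.com/downloads/example-driver.aspx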

Mostly I’m just curious why they would bother.

San Francisco Trip

Posted in Mobile Platform, Semantic Web, Social on June 9th, 2005 at 01:52:15

For those of you who are not yet aware, I will be in San Francisco this weekend, arriving Thursday night (late) and leaving early Sunday afternoon. I will be in meetings all day on Friday, but if anyone is interested in meeting up, let me know.

People I plan to see so far include, but are not necessarily limited to: Neil, twid, leora, miker and wombatmobile (possibly) from #mobitopia. I plan to visit tourist sites, as well as stop by The Mothership in Cupertino while I’m there. I want to ride the famous trolleys, I want to eat tacos in the Mission district, I want to visit Unicorn Precinct XIII (note to self: poke zool to fix sf.openguides).

What else should I be doing? Should I go to the DNA Lounge? Muir Woods Redwoods?

Advise me, dear reader, as to what you would do if you were in San Francisco for 36 hours with nothing else on your todo list! Tell me if you want to meet me and talk about the next hack for the Semantic Web! Tell me if you want to meet me and berate me for not working on location-based cell phone computing! Tell me your thoughts on my work, tell me what you’d like to cook up next. Point out the coolest things in and around downtown San Francisco, and come with me to see them.

The rest is up to you.

Building WebKit on Panther

Posted in Software, WebKit, XSLT on June 9th, 2005 at 00:26:48

I mentioned the other day the release of Apple’s WebKit, WebCore, and JavaScriptCore (the latter two of which were already publicly available). Naturally, the first thing I wanted to do was download it and give it a try. This post will outline the steps I took to get as far as possible in the build process at this point. First, I would like to mention that this project has the cleanest build steps I have ever seen. It is well documented all the way through, and for Tiger users on Xcode 2.0, the build process went off without a hitch the first time through. (Xcode 2.1 problems have since been fixed.) The members of the supporting IRC channel are helpful and intelligent, and the mailing list has already taken multiple patches from non-employees into the source tree. This is, quite simply, the best opening for an open source project that I have ever been aware of.

However, the build process currently favors those with Tiger, and the current CVS does not build for those using Panther. Apple developers have expressed an interest in correcting this once WWDC, being held this week in San Francisco, is over. So I took it upon myself to report bugs in Bugzilla where applicable, to help out the developers once they get a chance to breathe.

First problem: the build failed with “CarbonSound.h not available”. This was a result of not yet having installed the QuickTime 7 SDK. (It has been in Software Update; I just hadn’t touched it yet.) Updating fixed that.

Second problem: a 10.3.9 build failure: NSString may not respond to `+stringWithCString:encoding:`. This method was not available in Panther. Maciej has said he is working on a patch to have this use CFString instead, where it is available. (I am tossing about some terms I don’t know here, so please excuse any incorrect terminology.) Workaround for the time being: copy the last two build commands before the failure (a cd line and a gcc-3.3 line) and paste them back in, altering the gcc-3.3 line slightly to remove the -Werror. This means it may cause problems later on, but it will compile for the time being.
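Concretely, the workaround is just the same compile line with the warning-as-error flag stripped. (A sketch: the paths and flags below are placeholders – the real cd and gcc-3.3 lines are much longer and should be copied verbatim from your own build transcript.)

cd /path/from/the/build/transcript
# the gcc-3.3 line from the transcript, re-run with -Werror removed
gcc-3.3 -c -O2 example.m -o example.o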

Third problem: an isnan failure in kjs_window.cpp. This one boggles me a bit, especially since (as I mention in the bug) the code seems to have explicit knowledge of the problem. However, a workaround is now offered in comment 1 of the bug: replace

using std::isnan;

with

extern "C" int isnan(double);

This fixed the problem for me.

Fourth problem: XSLT headers not installed. This one is more symptomatic of the way that Apple releases updates, and is something that dajobe has brought up with building Redland in the past: “headers don’t match libraries”. That is true here as well, but I now know (thanks to toby from #webkit) that the reason is that Apple does *not* ship updated headers with libraries updated through Software Update. Since libxslt is new in 10.3.9, there are no development headers. Dave Hyatt, of the WebKit team, mentioned that the whole team, when building on Panther, had to install libxslt and libxml from source. Once I did this, the problem went away.
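For reference, installing the two libraries from source is the usual configure-and-make dance. (A sketch, assuming the xmlsoft.org tarballs; the version numbers are examples – substitute whatever you download.)

curl -O ftp://xmlsoft.org/libxml2/libxml2-2.6.19.tar.gz
tar xzf libxml2-2.6.19.tar.gz
cd libxml2-2.6.19 && ./configure && make && sudo make install && cd ..

curl -O ftp://xmlsoft.org/libxml2/libxslt-1.1.14.tar.gz
tar xzf libxslt-1.1.14.tar.gz
cd libxslt-1.1.14 && ./configure && make && sudo make install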

Fifth problem: libxml headers are wrong. This was before I installed libxml, which also fixed this problem. It is, again, related to the fact that Apple does not update headers through Software Update.

Once you get through these, you will have built both JavaScriptCore and WebCore. Congratulations! You now have two completely useless frameworks which the new WebKit will depend on once you can build it! 🙂

WebKit is the previously unreleased Apple-specific framework which is the “pretty” part of WebCore – it’s what ties everything together. It has a few more issues building on Panther, but most of them can be worked around by simply copy-pasting build lines without the -Werror flag. (Note that this may produce unstable results! These builds are not designed for production, and I do not advise doing this and then filing bug reports about Safari crashing.)

npapi headers not available – for some reason, building on Panther does not find the appropriate headers from the in-progress WebKit build. I really have no clue why this is, and neither did anyone else when I was building. My workaround was to copy the headers out of the framework and into ~/build/include (a directory I had to make), which was already on the include path:

cp ~/build/WebKit.framework/Versions/A/Headers/* ~/build/include
cp ~/build/WebKit.framework/Versions/A/PrivateHeaders/* ~/build/include

and then continue the build. I am not sure why this is necessary, but it does seem to work.

Missing 10.4 method -setCompositingOperation: for WebImageRenderer – two parts of the code require:

- (void)setCompositingOperation:(NSCompositingOperation)operation;
- (NSCompositingOperation)compositingOperation;

These methods were added in 10.4. This can be resolved by following the -Werror removal steps above. You will have to do this several times.

Missing 10.4 function CFMakeCollectible – CFMakeCollectible is new in 10.4. Building without -Werror allows the build to continue.

And, the current showstopper: missing SecurityNssAsn1 headers. These come from the libWebKitSystemInterface.a file, which is currently Tiger-specific. Once WWDC is over, a Panther binary will be released. Until then, this is where the ride stops: you can build WebCore and JavaScriptCore, but WebKit is out of your reach until you get your hands on Tiger.

Luckily for me, I’m going to be in Cupertino this weekend, so I’ll pick up a copy and get it installed soon 😉

Lesser GPL

Posted in Licenses on June 8th, 2005 at 22:58:26

Earlier today, I was reading some of the KHTML/WebKit discussions, and in particular what KHTML developers had said about Apple’s lack of follow-through – doing only the minimum legally necessary to comply with the LGPL. I was most interested in what obligations Apple has to the KHTML community under the LGPL.

In the process of reading this license, I found out that it is completely ridiculous. Some examples:

Section 2d: If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful.

This seems like some really strange way of saying that a library must provide valid output even with missing input. I’m not even sure I understand what this is – it reads almost like an instruction not to break backwards compatibility in LGPL-licensed libraries. I’m sorry, but that (to me, at least) seems like a flaming pile of crap. If someone wants to use an API in an application, it’s up to that developer to ensure that it passes the correct values.

Then for the cases where you are delivering an application linked to an LGPL library:

Section 6c: Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the cost of performing this distribution.

I have to accompany every work I distribute that links against an LGPL library with an offer of the source code, valid for at least three years? It’s my job to maintain the version of libxslt included with every one of my applications for three years after I distribute them, just in case libxml.org goes away? I understand the idea – people should be able to modify the library code behind an application, so they should have access to that code – but in the case of most of these libraries, I am not going to take the time and effort to maintain a copy. That’s what package management is for.

The rest of the license is at least understandable, but for small-time projects these kinds of requirements are ridiculous, and I find it really difficult to believe that people use this license. I’m sure other people think it makes perfect sense, but the use of the LGPL is something I’d never want to see or encourage.

WebKit Source

Posted in Social, Software, WebKit on June 7th, 2005 at 08:19:41

An announcement of the release of WebKit, the source of the rendering engine for the popular OS X browser, Safari. It includes mention of #webkit on irc.freenode.net, for discussion of WebKit, and information on how to get anonymous CVS access.

Currently it requires Xcode to build, but I’m sure someone out there will cook up some autotools goodness for it sometime soon.

Keep in mind that (as far as I know) this isn’t the actual shell that makes up Safari. It’s the source of the rendering engine inside it – basically, the bits that were taken from KHTML. I’m not sure, though, and I can’t read the code well enough to confirm it. However, one of the parts being released is WebKit: the interface that people have used in the past to make 10-line browsers in Xcode projects. This could mean we’ll see a lot more similar projects for other UNIXes – with the rendering taken care of and a simple binding, it becomes much simpler to write applications which display HTML.

Certainly an interesting development. Could this mean we’ll see Safari-like browsers based on it on other platforms in the near future? My bet is yes.

Library in RDF

Posted in Delicious Library, RDF, Semantic Web, XSLT on June 5th, 2005 at 21:19:20

A long time ago, when I first got a Mac, there was a lot of hubbub about a program called “Delicious Library”: an application that lets you scan in your books, and provides an awesome user interface for searching, storing, lending, and everything else you might want to do with them. At the time I wanted it, and I wanted it bad, but I decided to wait until I got an iSight: the idea of entering hundreds, perhaps up to a thousand, UPCs by hand did not strike me as one of my favored tasks.

On March 19th, I got an iSight: a birthday present from Jess. I thought then, “ooh, Delicious Library”, but never got around to it.

This weekend, I was starting to pack up books from the bookshelves. I thought “Hey, I won’t have a clue where any of the books are… unless…”

Jess was out of the house. I downloaded and tried the program: I scanned a full shelf of books (after getting some decent light) and packed them up before I hit the 25-item trial limit and had to pay the piper. $40 for knowing where all of these books are after we move (as well as a new toy to play with) is well worth it.

I scanned another shelf (and ran out of boxes), then sat down to do the fun part.

First: xml2rdf – an XSLT stylesheet to convert from Delicious Library’s XML format to RDF. One of the biggest problems with this stylesheet is that it needs to know which image files are actually available from Delicious Library: this is where files.xml comes in, which is constructed using the following bash commands:

echo "<container>" > files.xml
# list every medium-size cover image Delicious Library has cached
for i in ~/Library/Application\ Support/Delicious\ Library/Images/Medium\ Covers/*; do
  # strip the directory part, keeping just the filename
  j=`echo $i | sed -e 's!.*/!!'`
  echo "<image size='medium' name='$j' />" >> files.xml
done
echo "</container>" >> files.xml

This is then used with XSLT’s document() function to load the list of available files, preventing inaccurate <foaf:depiction>s from being spat into the output: Amazon does not store cover images for some books, so until I implemented this fix, there were broken image references.

Next: convert.py – load the file as an RDF model, delete all the existing dc:description statements, and replace them with plain-text versions converted using rtfreader from Brandon’s Program Archive.

Next: Process through cwm for RDF pretty printing.

Next: rdf2html – taking the RDF output and converting it to HTML.
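Strung together, the whole pipeline looks roughly like this. (A sketch only: the filenames and script invocations are illustrative, the path to the Delicious Library data file is from memory, and so are cwm’s flags.)

# 1. Delicious Library XML -> RDF, consulting files.xml for available covers
xsltproc xml2rdf.xsl ~/Library/Application\ Support/Delicious\ Library/Library\ Media\ Data.xml > books-raw.rdf
# 2. swap the RTF dc:description values for plain text
python convert.py books-raw.rdf > books-clean.rdf
# 3. pretty-print the RDF/XML via cwm
cwm --rdf books-clean.rdf --rdf > books.rdf
# 4. RDF -> HTML
xsltproc rdf2html.xsl books.rdf > books.html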

End result? A content-negotiated version of the books I’ve scanned so far in the Books Library – RDF and HTML versions available.
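You can see the negotiation in action with curl; the URL here is a stand-in for wherever the library lives:

curl -H "Accept: application/rdf+xml" http://example.com/books/   # RDF version
curl -H "Accept: text/html" http://example.com/books/             # HTML version

The server picks the representation based on the Accept header.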

This is some of my first major experience with XSLT, and I’ve found it to be pretty darn easy: far less difficult than I thought in the past. I think I may go on an XSLT kick for the next couple of weeks, so don’t be surprised if you see a lot more of my RDF looking a little bit prettier. For example, I’ve already written an XSLT stylesheet for the FIF reviews I’ve received, so if you’re using a capable browser, those will look a lot nicer than they used to.

Feedback in Feeds

Posted in Semantic Web, Web Publishing on June 5th, 2005 at 08:41:00

Some of you regular subscribers may have noticed that I’m currently working on the feedback-in-feed form that I added a few days back: specifically, trying to make it less obtrusive. However, one thing I didn’t realize in all my efforts is that I almost always already *have* the user’s homepage information: if they’ve left a comment on the site, it’s stored in a cookie in their browser (which is probably what they’re going to be submitting the form through).

“But”, I hear the audience crying, “You can’t see their cookies! They’re posting the form from their aggregator!”

Ah, my feeble-minded friends, this is true, but it is unimportant. What matters is the user agent that hits the final form – which is stored (now) in the same directory as this blog, meaning it has access to all the cookies set by this blog. Including, as it happens, the name and URL of the person, assuming they’ve left a comment here before.
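A quick illustration of why this works. (A sketch: the cookie names follow WordPress’s comment_author_* convention, but the hash suffix, form fields, and URLs below are placeholders.)

# the browser sends the blog's stored comment cookies along with the
# form post, because the handler lives on the same host as the blog
curl -b "comment_author_abc123=Jane Reader; comment_author_url_abc123=http://example.org/" \
  -d "rating=5" http://example.com/blog/feedback.php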

Of course, this has some limitations: people who have not commented here before, or whose cookies have expired, will not have a homepage set. (Cookies expire after one year in WordPress, it seems.) Still, in the interest of saving space, I have gone the route of removing the homepage field from the form, as well as pulling out some of the HTML I didn’t really like in the first place, and cleaning it up in general. I’ve been working on this over the past few days, in large part with help from jc, who was helping me figure out ways to make the controls smaller. He said that my feedback form was “somehow even more annoying than AdSense in feeds” – something I was loath to admit, but in the end had to agree with.

Since I know there are a large number of people who read this but never comment, there is also a built-in “more info” link, via which users can set their name/URI. So, if you have any interest, feel free to use that link to set your name/URI, which will then be stored with the feedback data.

Another thing I’m doing, which I hadn’t yet written or talked about in public (despite Danny’s conviction otherwise), is capturing the referer information for the feedback and exposing it via the RDF interface, attached to the review. This is a more generally useful property: a review for something can come via Amazon, a blog, or any number of other places. I’m simply using it to store the referer information, so I know whether someone came in via Planet Swhack, Planet RDF, Planet Mobile, LiveJournal, Bloglines, or any other of a number of web resources. Sometimes, of course, there is no referer information, in which case it was probably an aggregator, but I haven’t gotten far enough to analyze that yet. Unfortunately, User-Agent in most of these cases probably doesn’t make a bit of difference: the form is going to be posted via the browser, not the aggregator.

Danny advocates an extremely minimalist feedback mechanism, but I think that’s less likely to get people to submit feedback, especially once they realize he asks for more information afterwards. LiveJournal’s polls always get more responses than comments, because they’re low-impact in comparison. The same idea applies here: but something that is just a tickybox or a single form press is not enough to give the user a sense of offering helpful feedback. I hope that I’m achieving (for the most part) low impact as well, by redirecting back to the referer once the form is posted – but I want low-impact, informative, useful feedback. I think the recent design changes will help with that.

I do think this is the right idea, and that the implementation, rather than anything more basic, is the problem. So I’ll keep working on the implementation to make it better as time goes on.

Symbian Python Update

Posted in Flickr, Mobile Platform, Python, Symbian Python on June 4th, 2005 at 10:08:49

Matt Croydon mentioned the other day that there is a new Python for Series 60 alpha release. Reading through Erik Smartt’s post on the topic, I realized that it offers a number of the features I had wanted built in to the original release: camera access, Address Book and Contact APIs, and other similar things.

I had put off Symbian Python work for a while, but with the new release I think I’m going to put some more effort into it: the new APIs will make things easier (like automatically uploading pictures taken to Flickr, one of my original goals) and make me want to get hacking again.

Congratulations to the Python-on-Symbian team, and I’m looking forward to starting work with the new alpha release.

Google Sitemap Format

Posted in RDF, Semantic Web, XSLT on June 3rd, 2005 at 10:02:23

Josh points out Google’s Sitemap Protocol, via the SWIG chump. I pull out my XSLT-foo (what little of it there is), hack a bit back and forth, and run into a problem which uche helps me figure out: “XPath does *not* use the default prefix in the stylesheet for purposes of matching”. I fix my XSLT up a bit, and create a new RDF source under my semweb section: Google Sitemap Tools, including an XSLT stylesheet, example output, and a conversion service which uses the XSLT: for example, Google’s Example File in RDF.

Now, to find some sitemaps in action in the real world, and to add gzip decoding of gzipped sitemaps.
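For local experimentation in the meantime, xsltproc covers both cases. (A sketch: the stylesheet filename is illustrative – use the actual one from the Google Sitemap Tools page.)

xsltproc sitemap2rdf.xsl sitemap.xml > sitemap.rdf
# a gzipped sitemap just needs a decompression step first; '-' reads stdin
gzip -dc sitemap.xml.gz | xsltproc sitemap2rdf.xsl -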