Archive for the 'Technology' Category

Ning!

Posted in Ning, PHP, Technology, Web Publishing on October 4th, 2005 at 05:03:05

For the past 4 weeks or so, I’ve been working on a project known previously as 24 Hour Laundry.

Now, it’s no longer 24HL: Welcome Ning.

A development playground with all kinds of neat and nifty toys, Ning is attempting to do to application and code sharing what other apps have done to photos, bookmarks or other arenas, allowing people to clone, mix, and create new apps.

There’s a lot of cool things here, and I’ve got a pretty bad headache, so I’m not going to be able to cover all the things that I would like to here, but here’s some of the cooler things about the site:

* System wide content store. Public content which is created can be accessed by any application. This content store is well abstracted, and has a content creation and query system. You don’t have to worry about scaling up: You can leave that to the professionals in the backend. At the same time, you can collect data from all the other apps in the playground. You want to create a book reviews site? First, grab everything that’s known as a Book from the site, and then use the built in classes for ratings and comments to build a discussion board. The possibilities for content mix and match are really spectacular. However, if you don’t want others touching your data, you can mark it as “private” and use it only in your app – but why would you want to?
* Built in classes for lots of things. Build a calendar. Interact with Flickr. Make a GMap. Talk to Amazon. The code’s all done for you, you just use it. Bookshelf makes extensive use of the Amazon classes, Restaurant Reviews With Maps uses Google Maps to show where you’re going — Bay Area Hiking Trails shows you how to get there.
* RSS feeds of content. The Ning Pivot is a really cool way of looking at the content flowing by, but not only can you watch it, you can watch it flow by.

There’s about a half dozen other really nifty things here that I can’t even think of at the moment because it’s 5am and I’ve been walking like a Zombie for two weeks to get this stuff complete.

But the coolest thing is:
* All data added is placed under CC By-SA license. (If you don’t like this, ning isn’t for you.)
* All app code is completely open, and you can make it your own in 2 seconds.

Screw Ruby on Rails: who needs a 2 minute app, when you can write a 2 second app? All depends on how fast you can click.

If you run into problems with ning, feel free to drop them here: You can never fix all the bugs before release, but I think that the team working on Ning has done an absolutely incredible job with all the work they’ve put together here. I’ll pass them on as best as possible.

There’s a lot of other stuff I want to write — one that others here might find interest in is how similar Ning’s content store is to RDF, and why I think that there’s no functional difference. Of course, Marc and I got into a nice “discussion” on that one on IRC the other night, so maybe I’ll wait til I’m a bit less exhausted and can adequately express my points on the topic. 🙂

New Colors, New Features

Posted in Mobile Platform, Semantic Web, Technology on September 26th, 2005 at 12:10:58

crschmidt.net now features a new colorscheme: I’m still not sure how much I like it, but the old black/grey/white scheme was really starting to grate on me. (Note that the weblog uses a different stylesheet which I haven’t updated yet.)

Additionally, all pages now have a feature to allow commenting from users. So you can now leave a comment on any page! This is taken from Eikeon’s websites, which have this feature (although it requires logging in first). I’ve done some very basic escaping of script tags, and I do my best to add newlines if they are appropriate, but if you want to make your content look right, you’re best off just formatting it with HTML yourself.
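A rough sketch of the kind of comment handling described above, in Python. This is a guess at the approach rather than the code the site runs: the function name render_comment and the rule "only add line breaks when the comment contains no markup" are assumptions made for illustration.

```python
import re

# Matches the start of an opening or closing script tag, in any letter case.
SCRIPT_TAG = re.compile(r"<(/?\s*script)", re.IGNORECASE)


def render_comment(text):
    """Neutralize script tags and add line breaks to plain-text comments."""
    # Turn "<script" and "</script" into harmless text.
    safe = SCRIPT_TAG.sub(r"&lt;\1", text)
    # Assumption: a comment with no remaining markup is plain text,
    # so its newlines should become visible line breaks.
    if "<" not in safe:
        safe = safe.replace("\n", "<br />\n")
    return safe
```

Escaping only script tags leaves other HTML untouched, which is what lets commenters format their own content, but it is a much weaker filter than escaping everything.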

However, this means that it’s really easy to offer feedback on any page of the site now. If you’re interested in my semantic web tools, you can leave comments on the various ones there. You can comment on the code for any of my Python tools, on my symbian stuff, on pretty much anything. Soon, and very soon, I’ll be writing an RSS feed generator for this. Right now I’m just happy it works, and would love to see people commenting on any page on the site they’d like more info about or would like to offer feedback on.

irssi word completion

Posted in Technology on August 27th, 2005 at 04:22:04

Every now and then, I’ll try and type a difficult-to-type word on IRC, and curse the lack of auto-complete built into my IRC client. I’ve always thought “I should really look into fixing that.” Well, tonight I was sleepy and browsing through the entire list of irssi scripts (obtained via `rsync -avz main.irssi.org::irssiweb/scripts/scripts/\*.pl ~/.irssi/scripts/official`), and I discovered that there is a “wordcompletion” script, which pulls data from a MySQL database.

“Nifty!” I thought, and poked at it a bit more, finding that it simply stored words you used in messages into a MySQL database. So, I got to thinking. Wouldn’t it be nice to take the words from /usr/share/dict and dump them into there?

So I did.

for i in `cat /usr/share/dict/american-english`; do export v=`echo $i | perl -pe "s/'/\\\\\\\\'/"`; echo $v; echo "INSERT INTO words (word, prio) VALUES ('$v', 1)" |mysql -u irssi -pPASSHERE irssi ; done

And since I did it, I saved you the work: You can fetch the entire database dump (in compact, minimal impact one-insert form) from odds and ends, a new section on crschmidt.net. Additionally, you can grab my new version of the script from there, which changes the script to read all messages rather than just ones which were typed by you. In the process, I became interested enough to work out how to store these fields in a setting – the new version of the script features a number of improvements, such as saving the database password, user, and dsn in a setting, as well as offering help, so people who don’t know Perl enough to even change simple variables can use it.
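For the curious, the "one-insert form" can be sketched in Python as below. The table and column names (words, word, prio) come from the loop above; the helper names sql_quote and build_single_insert are made up for this example, and the escaping follows MySQL's backslash style. With a real database driver, a parameterized query would be the safer route.

```python
def sql_quote(word):
    """Quote a word as a MySQL string literal (backslash-style escaping)."""
    # Escape backslashes first so the quote escapes are not doubled up.
    escaped = word.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_single_insert(words, prio=1):
    """Build one multi-row INSERT statement for the words table."""
    if not words:
        raise ValueError("need at least one word")
    rows = ", ".join("(%s, %d)" % (sql_quote(w), prio) for w in words)
    return "INSERT INTO words (word, prio) VALUES %s;" % rows
```

One statement with many rows avoids starting a mysql client per word, which is what makes the shell loop slow on a full dictionary.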

I’ve contacted the author to let him know about these changes so he can roll them into the official version if he wishes. If I don’t hear back within a week, I’ll submit my version as an update to the original script at irssi.org.

Programs which are easy to script make a great way to keep yourself occupied late at night, and let you occasionally release something that seems impressive which otherwise wouldn’t. Thanks to the original author of the script (Jesper Lindh) as well as the authors of all irssi scripts for their help in getting this one out the door.

SVG::Metadata 0.28 Released

Posted in RDF, Semantic Web, SVG on August 22nd, 2005 at 22:03:52

While many people these days are switching to annotation-in-XHTML, there’s still at least one file format out there which has extremely useful metadata annotation using RDF/XML inside the document: SVG.

The Scalable Vector Graphics format has a Metadata element, which is expected to contain RDF/XML. This is great news for people who might wish to create a directory of SVG images: the metadata can be stored in the actual images, something that the Open Clip Art Library takes advantage of, using a number of tools to extract statistics and aggregate metadata from SVG files.

To take an example from the library, Autos_01.svg (SVG file, requires SVG viewer) contains 23 RDF statements. These triples are given a base of a cc:Work with the URL of the file itself, meaning that a simple query about the predicates and objects with http://openclipart.org/clipart/transportation/autos_01.svg as a subject returns the important aspects of this document. This includes description, creator, keywords, and license. The license is “Public Domain”: adding the images to the Open Clip Art Library requires placing them into the Public Domain.

For working with this data, developers of the project created the Perl module SVG::Metadata – a module for annotating SVG files with this metadata, as well as making changes to the metadata which already exists in such files.

The maintainer just announced on the Clipart Discussion list that he has released 0.28, which includes the changes from previous releases 0.26 and 0.27 which were mostly maintenance releases. (The message will eventually appear in the August threads, but hasn’t yet.)

The RDF generation in versions prior to 0.24 was broken, but was fixed in the 0.25 release – OCAL is now using this release in their scripts, so many of the more recent images in the library are valid RDF, meaning that you can simply pass it to Redland with the http://feature.librdf.org/raptor-scanForRDF feature set. In the Python bindings, that is:

p.set_feature("http://feature.librdf.org/raptor-scanForRDF", "1")

In rapper:

[crschmidt@creusa ~]$ rapper -c -f scanForRDF=1 http://www.openclipart.org/incoming/cat_scrathing_post_benji_01.svg
rapper: Parsing URI http://www.openclipart.org/incoming/cat_scrathing_post_benji_01.svg
rapper: Parsing returned 30 statements

I think this is a great example of how to work with structured metadata without dealing with the crappy aspects of RDF/XML syntax corner cases: simply write a library which parses the metadata, fills your variables up, and lets you modify them with a standard API, then lets you resync the data to the file. Congrats to Bryce for his hard work on the module, and on making the metadata for these SVG files accurate and useful to external users.
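The "parse the metadata, fill your variables" idea can be sketched in a few lines of Python using only the standard library. This is not SVG::Metadata (which is Perl) and not a real RDF parser: it assumes the metadata block has the simple cc:Work shape shown in the made-up sample, and the function name read_svg_metadata is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "svg": "http://www.w3.org/2000/svg",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "cc": "http://web.resource.org/cc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up sample document, shaped like the metadata block described above.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <metadata>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:cc="http://web.resource.org/cc/"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <cc:Work rdf:about="">
        <dc:title>Example car</dc:title>
        <dc:description>A made-up description</dc:description>
        <cc:license rdf:resource="http://web.resource.org/cc/PublicDomain"/>
      </cc:Work>
    </rdf:RDF>
  </metadata>
</svg>"""


def read_svg_metadata(svg_text):
    """Return title, description and license from an SVG metadata block."""
    root = ET.fromstring(svg_text)
    work = root.find("svg:metadata/rdf:RDF/cc:Work", NS)
    if work is None:
        return {}
    result = {}
    for key in ("title", "description"):
        node = work.find("dc:" + key, NS)
        if node is not None and node.text:
            result[key] = node.text.strip()
    lic = work.find("cc:license", NS)
    if lic is not None:
        # The license is a resource reference, not element text.
        result["license"] = lic.get("{%s}resource" % NS["rdf"])
    return result
```

Treating the block as plain XML only works while every file uses the same serialization; a proper RDF parser such as Redland handles the other ways the same triples can be written.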

GRDDL, Microformats

Posted in GRDDL, Microformats, RDF, Semantic Web, Technology, XSLT on August 15th, 2005 at 12:03:16

At some point, on the FOAFnet mailing list, David Sifry asked, “What is DOAP? What is GRDDL? What’s the use case?” in response to Danny Ayers’ post on the mailing list.

My reply was pretty simple in my opinion, but I never put it anywhere public to read. I think it’s a relatively succinct explanation of why GRDDL can be useful for moving between the microformats, with tiny datasets, and RDF, with the large dataset behind it.

What is DOAP? What is GRDDL? What’s the use case?

DOAP – Description of a Project. DOAP is to Open Source software what FOAF is to people. http://usefulinc.com/doap

GRDDL – Gleaning Resource Descriptions from Dialects of Languages. This is a way of performing transformations from (X)HTML documents to an RDF document. This transformation takes place via XSLT, specified either implicitly through the “profile” attribute on the <head> of the HTML document, or via a link rel="transformation", which parsers can then apply to the document to get something with real meaning out of it.

http://www.w3.org/TeamSubmission/grddl/
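To make the discovery step concrete, here is a small Python sketch of how a GRDDL-aware client might find the stylesheets a page advertises. It assumes the page opts in with the data-view profile URI on its head element and names each stylesheet in a link element with rel="transformation"; the class and function names are invented for the example, and actually applying the XSLT is left out.

```python
from html.parser import HTMLParser

# Profile URI assumed to be the one a GRDDL page uses to opt in.
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


class TransformationFinder(HTMLParser):
    """Collect transformation links from a page that opts in to GRDDL."""

    def __init__(self):
        super().__init__()
        self.has_profile = False
        self.transformations = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            # The profile attribute may hold several space-separated URIs.
            profiles = (attrs.get("profile") or "").split()
            self.has_profile = GRDDL_PROFILE in profiles
        elif tag == "link":
            rels = (attrs.get("rel") or "").lower().split()
            if "transformation" in rels and attrs.get("href"):
                self.transformations.append(attrs["href"])


def find_transformations(html_text):
    """Return stylesheet URLs, or an empty list if the page has no profile."""
    finder = TransformationFinder()
    finder.feed(html_text)
    return finder.transformations if finder.has_profile else []
```

Each URL returned would then be fetched and run over the page with an XSLT processor to produce the RDF.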

Use Case:

“Microformats” and storing data in HTML may be good for people who are writing specialized tools for each format they want to deal with, but GRDDL allows for people who want to support all these to simply write one translation, using XSLT, and take advantage of the underlying data model of RDF. This allows you to merge existing data sources with the newly emergent small-s semantic technologies, and to use existing tools for combining these sets of data and discovering correlations between them, using the already existing RDF data access framework.

If I have a datastore of RDF data (which I do), and someone wants to find if the maintainer of a certain project has contact information in that database, using hBlah pages, I can’t really do that. However, if I merge the datasets with the already existing FOAF data out there, I can find out that a person named Christopher Schmidt, who is a maintainer of “julie, aka redlandbot”, also has an email address of crschmidt@crschmidt.net.
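The merge itself is the easy part, which a toy Python model shows. Triples here are plain tuples and a graph is a set of them, so merging is a set union. The identifiers (project:julie, person:crschmidt) and the helper names are hypothetical, and the sketch assumes both sources already use the same identifier for the person, which real data often needs extra work to achieve.

```python
# Triples that might come out of a GRDDL transformation of a project page.
doap_triples = {
    ("project:julie", "doap:name", "julie, aka redlandbot"),
    ("project:julie", "doap:maintainer", "person:crschmidt"),
    ("person:crschmidt", "foaf:name", "Christopher Schmidt"),
}

# Triples from an existing FOAF store.
foaf_triples = {
    ("person:crschmidt", "foaf:name", "Christopher Schmidt"),
    ("person:crschmidt", "foaf:mbox", "mailto:crschmidt@crschmidt.net"),
}


def objects(graph, subject, predicate):
    """Return all objects for a subject and predicate, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def maintainer_mailboxes(graph, project):
    """Follow project -> maintainer -> mailbox across the graph."""
    boxes = []
    for person in objects(graph, project, "doap:maintainer"):
        boxes.extend(objects(graph, person, "foaf:mbox"))
    return boxes


# Merging two RDF graphs is just the union of their triples.
merged = doap_triples | foaf_triples
```

Asking either source alone for the maintainer's mailbox comes back empty; only the merged graph can answer it.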

That’s a pretty simplified use case, but the general idea is simply to take the tiny sources of data that the h* formats provide, and integrate them with the millions of pieces of data out there already in FOAF, RSS, and everything else Semantic.

Building WebKit on Panther

Posted in Software, WebKit, XSLT on June 9th, 2005 at 00:26:48

I mentioned the other day the release of Apple’s WebKit, WebCore, and JavascriptCore (the latter two of which were already publicly available). Naturally, the first thing I wanted to do was download it and give it a try. This post will outline the steps I took to get as far as possible in the build process at this point. First, I would like to mention that this project has the cleanest build steps I have ever seen. It is well documented all the way through, and for Tiger users on Xcode 2.0, the build process went off without a hitch, the first time through. (Xcode 2.1 problems have since been fixed.) The members of the supporting IRC channel are helpful and intelligent, and the mailing list has already taken multiple patches from non-employees into the source tree. This is, quite simply, the best opening for an open source project that I have ever been aware of.

However, the build process currently favors those with Tiger, and the current CVS does not support those who are using Panther. Apple developers have expressed an interest in correcting this once the WWDC, being held this week in San Francisco, is over. So, I took it upon myself to report bugs in bugzilla where they are applicable, to help out developers when they get a chance to breathe.

First problem: Building returned a problem with “CarbonSound.h not available”. This was as a result of not yet installing the QuickTime 7 SDK. (It has been in software update, I just hadn’t touched it yet.) Updating fixed that.

Second Problem: 10.3.9 Build Failure: NSString may not respond to `+stringWithCString:encoding:’. This is a method which was not available in Panther. Maciej has said he is working on a patch to have this use CFString instead, where it is available. (I am tossing about some terms I don’t know here, so please excuse any incorrect terminology.) Workaround for the time being – copy the last two build commands before the crash (a cd line and a gcc-3.3 line) and paste them, altering the gcc-3.3 line slightly to remove the -Werror. This means that it may cause problems later on, but will compile for the time being.

Third Problem: isnan failure in kjs_window.cpp: This one boggles me a bit, especially since (as I mention in the bug) there seems to be explicit knowledge in the code of the problem. However, a workaround is now offered in the bug in comment 1: replace using std::isnan; with extern "C" int isnan(double); This fixed the problem for me.

Fourth Problem: XSLT Headers not installed – This one is more symptomatic of the way that Apple releases updates, and is something that dajobe has brought up with building Redland in the past: “Headers don’t match libraries”. This is true here as well, but I now (thanks to toby from #webkit) know that the reason for this is that Apple does *not* ship updated headers with libraries updated through Software Update. Since libxslt is new in 10.3.9, there are no development headers. Dave Hyatt, of the WebKit team, mentioned that the whole team, when building on Panther, had to install libxslt and libxml from the source. Once I did this, it made this problem go away.

Fifth Problem: libxml headers are wrong – this was before I installed libxml, which also fixed this problem. It is, again, related to the fact that Apple does not update headers with Software Update.

Once you get through these, you will have built both JavascriptCore and WebCore. Congratulations! You now have two completely useless frameworks which the new WebKit will depend on when you can build it! 🙂

WebKit is the previously unreleased Apple-specific Framework which is the “pretty” part of WebCore – it’s what ties everything together. It has a few more issues building on Panther, but most of them can be worked around by simply copy pasting build lines without the -Werror flag. (Note that this will produce possibly unstable results! These builds are not designed for production, and I do not advise doing this and filing bug reports on Safari crashing.)
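Since the same edit has to be made over and over, the "re-run the build line without -Werror" trick can be scripted. A minimal Python sketch, assuming the failed compiler command has been copied out of the build log as a single string; the function name drop_werror is made up, and shlex.join needs Python 3.8 or later.

```python
import shlex


def drop_werror(command_line):
    """Return the same compiler command with the -Werror flag removed."""
    args = shlex.split(command_line)
    kept = [arg for arg in args if arg != "-Werror"]
    # shlex.join re-quotes arguments (such as paths with spaces) as needed.
    return shlex.join(kept)
```

The rewritten command still has to be run from the directory named in the preceding cd line, and the same caveat applies: warnings that were errors are now being ignored.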

npapi headers not available – for some reason, building on Panther does not find the appropriate headers from the in-process WebKit build. I really have no clue why this is, and neither did anyone else when I was building. My workaround was to copy the headers out of the framework and into ~/build/include (a directory I had to make), which was already on the path. cp ~/build/WebKit.Framework/Versions/A/Headers/* ~/build/include, cp ~/build/WebKit.Framework/Versions/A/PrivateHeaders/* ~/build/include, then continuing the build. I am not sure why this is necessary, but it does seem to work.

Missing 10.4 Method -setCompositingOperation for WebImageRenderer – Two parts of the code require the methods - (void)setCompositingOperation:(NSCompositingOperation)operation; and - (NSCompositingOperation)compositingOperation; which were added in 10.4. This can be resolved by following the above -Werror removal steps. You will have to do this several times.

Missing 10.4 Method CFMakeCollectible – CFMakeCollectible is new in 10.4. Building with no -Werror allows the build to continue.

And, the current showstopper: Missing SecurityNssAsn1 headers — This comes from the libWebKitSystemInterface.a file, which is currently Tiger-specific. Once WWDC is over, a Panther binary file will be released. Until then, this is where the ride stops: you can build WebCore and JavascriptCore, but WebKit is out of your reach until you get your hands on Tiger.

Luckily for me, I’m going to be in Cupertino this weekend, so I’ll pick up a copy and get it installed soon 😉

Library in RDF

Posted in Delicious Library, RDF, Semantic Web, XSLT on June 5th, 2005 at 21:19:20

A long time ago, when I first got a Mac, there was a lot of hubbub about a program called “Delicious Library”: an application that would let you scan in your books, and provided an awesome user interface to searching, storing, lending, and everything else you might want to do with them. At the time, I wanted it, and I wanted it bad, but I decided to wait until I got an iSight: the idea of entering hundreds, perhaps up to a thousand, UPCs by hand, did not strike me as one of my favored tasks.

March 19th, I got an iSight: a birthday present, from Jess. I thought then “ooh, Delicious Library”, but never got around to it.

This weekend, I was starting to pack up books from the bookshelves. I thought “Hey, I won’t have a clue where any of the books are… unless…”

Jess was out of the house. I downloaded and tried the program: I scanned a full shelf of books (after getting some decent light) and packed them up before I hit my 25 limit and had to pay the piper. $40 for knowing where all of these books are after we move (as well as a new toy to play with) is well worth it.

I scanned another shelf (and ran out of boxes), then sat down to do the fun part.

First: xml2rdf – an XSLT stylesheet to convert from Delicious Library’s XML format to RDF. One of the biggest problems with this stylesheet is that it needs to know about the actual image files available from delicious library: this is where files.xml comes in, which is constructed using the following bash commands:

echo "<container>" > files.xml
for i in ~/Library/Application\ Support/Delicious\ Library/Images/Medium\ Covers/*; do
export j=`echo "$i" | sed -e 's!.*/!!'`
echo "<image size='medium' name='$j' />" >> files.xml
done
echo "</container>" >> files.xml

This is then used with XSLT’s document() function in order to load what files are available, to prevent inaccurate <foaf:depiction>s being spat into the source: Amazon does not store cover images for some books, so until I implemented this fix, there were broken image references.
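The same files.xml can be produced from Python, which also takes care of escaping odd characters in file names. This is a sketch rather than what I actually ran: build_files_xml is a made-up name, and it takes a list of file names (for example from os.listdir on the Medium Covers directory) instead of reading the directory itself.

```python
from xml.sax.saxutils import quoteattr


def build_files_xml(names):
    """Build the files.xml listing for a list of cover image file names."""
    lines = ["<container>"]
    for name in sorted(names):
        # quoteattr adds the surrounding quotes and escapes &, < and >.
        lines.append("<image size='medium' name=%s />" % quoteattr(name))
    lines.append("</container>")
    return "\n".join(lines) + "\n"
```

Writing the returned string to files.xml gives the stylesheet the same container/image structure to load through document().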

Next: convert.py – Load the file as an RDF model, delete all the existing dc:description statements, and replace them with versions converted from RTF using rtfreader, from Brandon’s Program Archive.

Next: Process through cwm for RDF pretty printing.

Next: rdf2html – taking the RDF output and converting it to HTML.

End result? Content negotiated version of the books I’ve scanned so far in the Books Library – RDF and HTML versions available.

This is some of my first major experience in XSLT, and I’ve found it to be pretty darn easy: far less difficult than I thought it was in the past. I think that I may go on an XSLT kick for the next couple weeks, so don’t be surprised if you see a lot more of my RDF looking a little bit prettier. For example, I already wrote an XSLT stylesheet for the FIF reviews I’ve received, so if you’re using a capable browser, that will be a lot nicer looking now than it used to be.

Google Sitemap Format

Posted in RDF, Semantic Web, XSLT on June 3rd, 2005 at 10:02:23

Josh points out Google’s Sitemap Protocol, via the SWIG Chump. I pull out my XSLT-foo (what little of it there is). I hack a bit back and forth, run into a problem which uche helps me figure out: “XPath does *not* use the default prefix in the stylesheet for purposes of matching”, fix my XSLT up a bit, and create a new RDF source under my semweb section: Google Sitemap Tools, including an XSLT stylesheet, example output, and a conversion service which uses the XSLT: For example, Google’s Example File in RDF.
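The namespace gotcha is easy to reproduce outside XSLT. The Python sketch below uses ElementTree's path matching, which behaves the same way on this point: an unprefixed name never matches an element that lives in a default namespace. The sample document is made up, the namespace URI is the 0.84 sitemap one as I understand it, and the sm prefix is arbitrary.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed for the sitemap format.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"

SAMPLE = """<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url><loc>http://example.org/</loc></url>
  <url><loc>http://example.org/about</loc></url>
</urlset>"""

root = ET.fromstring(SAMPLE)

# Unprefixed names mean "no namespace", so this finds nothing.
unprefixed = root.findall("url/loc")

# Binding a prefix to the namespace makes the same path match.
prefixed = root.findall("sm:url/sm:loc", {"sm": SITEMAP_NS})
locations = [node.text for node in prefixed]
```

In a stylesheet the equivalent fix is to declare a prefix for the sitemap namespace on xsl:stylesheet and use it in every match and select expression.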

Now, to find some sitemaps in action in the real world, and add gzip decoding of gzipped sitemaps.
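The gzip part is likely to be the smaller job. A minimal Python sketch, assuming the sitemap has already been fetched into a bytes object; maybe_gunzip is a hypothetical helper name.

```python
import gzip


def maybe_gunzip(data):
    """Decompress gzipped bytes, and pass anything else through untouched."""
    # Every gzip stream starts with the two magic bytes 0x1f 0x8b.
    if data[:2] == b"\x1f\x8b":
        return gzip.decompress(data)
    return data
```

Sniffing the magic bytes instead of trusting a .gz extension means the conversion service can accept either kind of sitemap at the same URL.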

Javascript, RDF Searching

Posted in Javascript, PHP, SPARQL on May 31st, 2005 at 11:29:06

I’ve been doing some playing with goofy Javascript stuff lately to try to get my head wrapped around it, since I’m going to be needing to implement it in a few tools at work in the near future.

I’ve so far used it in:
1. An admin interface for Athena’s email accounts
2. An inventory listing for a work project
3. The newest one, a “suggestion” field for Wordnet searches against the RDF store I just imported this morning

Danny alerted me to the existence of a new Wordnet dataset. I grabbed the full set, dropped it into Redland, and set up a sparql search against it. The top box there is the nifty one though: type in a string (say, apple) and watch the right side as a list of suggestions is populated.

I still need to get it actually doing a Google Suggest-like dropdown box, but haven’t had the time to hack WICK to do what I want as far as that goes.

I’m still learning, and as such, the code is sucky. I wouldn’t recommend reading it for an example: it’s a quick hack, but it works. Still many bugs to work out – for example, if you type apple, it still searches for app, appl, apple in the process. But I’ll get there. (Okay, so I just did a few bug fixes that make it much better, and switched the search mechanism to use MySQL rather than an 11 Meg PHP array. Much better now.)
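The lookup behind a suggestion box boils down to a prefix search over a sorted list of words. A Python sketch of that idea follows; it is not the PHP or MySQL code behind the page, and the function name suggest and the limit parameter are invented for the example.

```python
import bisect


def suggest(sorted_words, prefix, limit=10):
    """Return up to `limit` words from a sorted list that start with prefix."""
    # All words sharing a prefix sit next to each other in sorted order,
    # so binary search finds where the run of matches begins.
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:start + limit]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches
```

An indexed column queried with a trailing-wildcard LIKE does much the same thing inside the database, which fits with the MySQL switch helping so much.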

Anyway, I think it’s cool. RDF people can mark it down as “another SPARQL datastore”, Javascript people can mark it down as “Another idiot trying to use XMLHttpRequest and doing it wrong.”

Lemme know if you’ve got suggestions!

XSLT + Image Regions + Sparql

Posted in Flickr, Image Description, RDF, SPARQL, XSLT on May 22nd, 2005 at 20:05:23

Read Masahide’s notes on XSLT+Image Regions. Used some tools to convert my flickr photos to RDF.

Converted an XSLT Stylesheet to a different result format. Loaded ~400 RDF files into a Model, totalling 33,000 statements. Added an option to my Sparql Interface. Changed the default query. Made the extra option add the stylesheet.

Ran a query. Tweaked until it worked. Typed it all up here, to share with all of you.

Hooray for masahide, flickr, and all kinds of other wonderful things.