Archive for the 'Image Description' Category

Flickr Image Region Selection

Posted in Flickr, Image Description, SPARQL on May 26th, 2005 at 22:58:33

One of the things I’ve noticed with my Image Region stuff, which I posted about recently, is that it’s slow. I didn’t really think about why: at first, a lot of it seemed to have to do with the client side XSL, or the CSS cropping of gigantic images.

However, I’m now realizing that this is using a regex with a pretty heavy query: The kind of query that I wouldn’t want anyone to run against julie, because it would just take too long.

The reason for this is Redland’s current REGEX implementation: It basically loads all the literals out of the store and does a regex against them after it has them, which is obviously not ideal.
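To make that cost concrete, here is a toy sketch in Python. It is not Redland's code: the in-memory triple list, the index, and the `ex:` and `example.org` identifiers are all made up for illustration. It only contrasts testing every literal against a regex with a direct lookup on a known value.

```python
import re
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); not real data.
TRIPLES = [
    ("ex:photo1", "dc:title", "Foghorn Leghorn and friends"),
    ("ex:photo1", "foaf:page", "http://example.org/photos/1/"),
    ("ex:photo2", "dc:title", "Tony lounging"),
    ("ex:photo2", "foaf:page", "http://example.org/photos/2/"),
]


def regex_scan(triples, predicate, pattern):
    """Fetch every value for the predicate, then test each one against the regex."""
    rx = re.compile(pattern)
    return [s for s, p, o in triples if p == predicate and rx.search(o)]


def build_index(triples):
    """Index subjects by (predicate, object) so an exact match is one dict lookup."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[(p, o)].append(s)
    return index


def exact_lookup(index, predicate, value):
    """Exact match against the index; no literal is scanned."""
    return index.get((predicate, value), [])
```

The regex path touches every title in the store no matter how selective the pattern is, while the exact lookup does the same amount of work whether the store holds four statements or 33,000.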

With that in mind, I tried to think of interesting queries which could be done without requiring a regex, and came up with the idea of Flickr image searches: show me a closeup of all the regions in a Flickr image of mine.

So, now there’s an additional search box on my SPARQL interface: Flickr ID/URI. It then uses the foaf:page part of the photo to query against, which is obviously much faster.
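A minimal sketch of how such a query could be built, assuming the photo's `foaf:page` value is stored as a URI resource. The function name `build_region_query` and the `example.org` URI are hypothetical, and the query uses current SPARQL syntax rather than whatever the search form actually sends.

```python
def build_region_query(page_uri):
    """Return a SPARQL query for the regions of the photo whose foaf:page is page_uri."""
    # Reject characters that cannot appear inside a SPARQL <...> IRI reference.
    if any(ch in page_uri for ch in '<>" {}|\\^`'):
        raise ValueError("not a usable URI: %r" % page_uri)
    return "\n".join([
        "PREFIX dc: <http://purl.org/dc/elements/1.1/>",
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>",
        "PREFIX imgreg: <http://www.w3.org/2004/02/image-regions#>",
        "SELECT ?img ?title ?atitle ?coord",
        "WHERE {",
        "  ?img foaf:page <%s> ;" % page_uri,
        "       dc:title ?title ;",
        "       imgreg:hasRegion ?r .",
        "  ?r dc:title ?atitle ;",
        "     imgreg:coords ?coord .",
        "}",
    ])


if __name__ == "__main__":
    print(build_region_query("http://example.org/photos/123/"))
```

Because the page URI is a fixed term in the pattern, the store can match it directly instead of filtering literals afterwards.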

Maybe I’ll expand this: let people put in any flickr photo ID, and display the information using XSLT against an RDF datasource, with a link to the output of the datasource. I’ve got all the tools to do it now running in Python locally, so I don’t think it would be too difficult: I would need to get some error parsing together though. I really wish I could tie PHP / Python code on the web together more easily though…

Anyway, an example: Flickr Page to RDF generates Regions.

Try it out at The SPARQL search. As always, data and query are shown inside the source of the page, at the bottom.

More on Image Regions

Posted in Flickr, Geolocation, Image Description, RDF, SPARQL on May 23rd, 2005 at 18:43:40

My post last night was a bit cryptic, so let me walk through a bit more clearly what I’ve been doing, since I seem to have picked up the interest of some more people.

I currently am using Flickr to annotate my photos: primarily because I like their image region annotations, and partially because their API offers me a way to get lots of data out that I’ve put in, which is useful to me. So, that’s what I’m using for photo annotation at the moment, which may change at any point.

Masahide has a flickr2rdf service: flickr2rdf takes a Flickr Photo page URI and exports RDF from it: For example, a picture of myself, my ex girlfriend, and Foghorn Leghorn can be seen, fully annotated, using XSLT+RDF, via the flickr2rdf tool.

Additionally, the original photos stored at flickr (full size) have EXIF information: this information can be exported via Masahide’s equally cool exif2rdf tool: Foghorn Leghorn Example.

Once I have the photo_id of a photo, I can collect all these statements together. Additionally, since I am using tags from GeoBloggers for geolocation, I have a tool which parses out these tags (using the Flickr API) and creates Geo data for them.
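The tag parsing step can be sketched roughly as follows. This is not the actual tool: it assumes the tags have already been fetched as plain strings, that they follow the `geo:lat=` / `geo:lon=` (or `geo:long=`) pattern, and that the position is attached directly to the photo URI, which is a modeling choice made only for the example. The property names come from the W3C Basic Geo vocabulary.

```python
import re

GEO_NS = "http://www.w3.org/2003/01/geo/wgs84_pos#"
# Assumed tag shape: geo:lat=42.5, geo:lon=-71.25 (geo:long= is accepted too).
TAG_RX = re.compile(r"^geo:(lat|lon|long)=(-?\d+(?:\.\d+)?)$")


def parse_geo_tags(tags):
    """Return (lat, lon) from a list of tag strings, or None if either is missing."""
    found = {}
    for tag in tags:
        m = TAG_RX.match(tag.strip())
        if m:
            key = "lat" if m.group(1) == "lat" else "lon"
            found[key] = float(m.group(2))
    if "lat" in found and "lon" in found:
        return found["lat"], found["lon"]
    return None


def geo_ntriples(photo_uri, lat, lon):
    """Serialize the position as two N-Triples lines using the Basic Geo terms."""
    return [
        '<%s> <%slat> "%s" .' % (photo_uri, GEO_NS, lat),
        '<%s> <%slong> "%s" .' % (photo_uri, GEO_NS, lon),
    ]
```

Photos without both coordinates simply yield `None`, so they can be skipped without emitting half a position.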

I add a few tracking statements: specifically, links to seeAlso the final RDF/XSLT view of the image, (again, Foghorn Leghorn example). I serialize the Model out from Redland, and get a directory full of files full of RDF singletons. From here, I use cwm to process the singletons into an abbreviated RDF/XML file. These files are then synced to the http://crschmidt.net/albums/flickr/ directory. Here, I use a couple little tricks to add an XSLT declaration as the first line of each file, so that the content negotiated version offers XML delivered as application/xml, rather than just application/rdf+xml (which Firefox won’t display in a browser).
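The stylesheet step might look something like this in Python. It is a sketch, not the actual trick used on the site: the function name and the `view.xsl` href in the usage line are placeholders. One detail worth noting is that when a file starts with an XML declaration, the `xml-stylesheet` instruction has to go after it rather than literally on the first line.

```python
from xml.sax.saxutils import quoteattr


def add_stylesheet_pi(xml_text, href):
    """Insert an xml-stylesheet processing instruction ahead of the root element."""
    pi = "<?xml-stylesheet type=\"text/xsl\" href=%s?>\n" % quoteattr(href)
    if xml_text.startswith("<?xml "):
        # Keep the XML declaration first; the stylesheet PI follows it.
        end = xml_text.index("?>") + 2
        head, rest = xml_text[:end], xml_text[end:].lstrip("\n")
        return head + "\n" + pi + rest
    return pi + xml_text


if __name__ == "__main__":
    print(add_stylesheet_pi('<?xml version="1.0"?>\n<rdf:RDF/>', "view.xsl"))
```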

Next step is to add each of these files into an RDF model. Since I’m still occasionally changing statements, I’ve been dropping the whole model and readding every time: this doesn’t take too long, as it’s only a few hundred files, and Redland is speedy quick.

So, now we have a database full of RDF statements. Fine. But that’s not too useful. So, I have my SPARQL query interface. Which is all well and good, for people who have lots of knowledge of RDF. It can provide some cool results.

But it doesn’t really do anything *fun*. So last night, I added an optional checkbox, that said “If you have something in a specific query format, process an XSLT file against it”. I tweaked this XSLT from Masahide’s example, linked yesterday, into what it is now, which you can see, if you’re interested.

Well, that’s all well and good, but most people don’t understand SPARQL enough to know what they should type in. What’s the use of having to learn a language just to see some pictures? So, my next step was to add a search box specifically for Regions: my sparql page has a search box now specifically for this purpose.

I realized after a couple times, though, that using client side XSLT to process the XML was really slow, clunky, and generally ugly. Not to mention that Mozilla’s XSLT doesn’t let me disable-output-escaping on variables: so, I installed php4-xslt, and started using that implementation on the server side.

Yeah, that’s all well and good too, but now my pretty RDF with queries and all went away! So, I added them back: at the end of the Foghorn Search, in a comment, you’ll see:

Generated using the XSLT stylesheet at http://crschmidt.net/xslt/imgreg.xsl against the data generated by the query:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX imgreg: <http://www.w3.org/2004/02/image-regions#>
SELECT ?img,?title,?page,?desc,?atitle,?coord
WHERE {
?img
dc:title ?title;
foaf:page ?page;
dc:description ?desc;
imgreg:hasRegion ?r.
?r
dc:title ?atitle;
imgreg:coords ?coord.
FILTER REGEX(?atitle, "Foghorn") }

Data was:

followed by the XML version of the SPARQL query results.

Another interesting example: Schmidt – myself, family members, and others.

Anyway, being a bit more informative seemed appropriate given the situation. So there’s my implementation toy of the day.

XSLT + Image Regions + Sparql

Posted in Flickr, Image Description, RDF, SPARQL, XSLT on May 22nd, 2005 at 20:05:23

Read Masahide’s notes on XSLT+Image Regions. Used some tools to convert my flickr photos to RDF.

Converted an XSLT Stylesheet to a different result format. Loaded ~400 RDF files into a Model, totalling 33,000 statements. Added an option to my Sparql Interface. Changed the default query. Made the extra option add the stylesheet.

Ran a query. Tweaked until it worked. Typed it all up here, to share with all of you.

Hooray for masahide, flickr, and all kinds of other wonderful things.

RDF and Images

Posted in Image Description, Semantic Web, SPARQL on May 8th, 2005 at 12:30:42

Tony Lounging

I know that I’m far too lazy to actually describe my images. I never do it. I write tools to help me, and I still don’t. So, my goal is to use tools which do it for me. With Masahide’s EXIF tools, flickr, and flickr2rdf, I can do this, with a little fudging to get the output to flow together better.

I have a lot of photos to describe, and I was going to get to work on it, when I reached for my keyboard… and found the cat lying on it. So, I switched to the other computer (zeus, rather than hermes) and got to work, creating a SPARQL interface for my photos. Maybe if I can search them, I’ll actually describe them.

I haven’t done a whole lot yet, but the start of my work is in place, with a nice SPARQL query against it. Of course, so far there’s only one photo, but this example should get you started, and if you care, you can check out the data as well.

Search My Photos – the crschmidt.net album organization service.

Image Storage

Posted in Image Description, Semantic Web on January 14th, 2005 at 11:59:55

Earlier, I mentioned how I was going to be switching to Flickr for Image storing. Although this is still the case, I’m waiting a bit until I do: I want to write some tools for generating RDF for me personally before I spend the money on it, I think. There’s already some work being done: Masahide announced a flickr2rdf service of his own, but it’s generalized, and I want to do something a bit more specific. I want to be able to describe people in the description, and have a tool which automatically extracts it. Not quite “Natural Language Processing”, something I’m sure that Arnia will chide me for, but something along those lines. Maybe even just using n3 in HTML Comments? Wonder how the API deals with that…

However, in the meantime, I’ve installed Gallery2, and I must admit, it’s very nice. I’ve been working in the module framework, and I’ve been able to extract exif information and include it in an RDF RSS Feed. Next step is to go back through my steps and look at all the changes I’ve done, then export it so it’s a usable format for the developers, and integrate it into their RSS 2.0 development. Triples, triples everywhere!

Once I’m sure my Gallery is in a usable state, I’ll release the work I’ve done on it. For the time being, I’m making sure that I have something I’m willing to switch to before I publicize it.

Hope everyone has a good weekend!

Flickr and RDF

Posted in Flickr, Image Description, RDF on January 6th, 2005 at 23:25:30

I’m an open minded kind of guy. There are a lot of services out there, and even though some of them aren’t open source, it’s possible that they may do what I want them to do. One of those services is Flickr, a photo sharing service.

Flickr does a lot of very nifty things: updates from anywhere, advanced annotations, including an extremely easy to use Javascript interface for annotating regions of images, and posting to blogs directly from the service. However, that’s not the coolest aspect of the service, in my opinion.

Flickr provides a relatively advanced, full featured, well documented API – a way to get pretty much any information you might want out of the site without screenscraping. (LiveJournal, in comparison, strongly discourages screenscraping, preferring that you use services listed on their bot-friendly services list. However, the data afforded through these interfaces is extremely limited to the point where it’s unusable for most advanced tasks.) Through this API, you can retrieve all the information you want about the people and photos available through the service.

This is especially interesting to me as an RDF nut because it means that I can use Flickr’s nice interface and handy annotation tools – and at the same time, I can convert the data, via the API, to an RDF format that I can use for all the things I’ve been describing in my Image Description posts.

The limits a free Flickr account places on you are kinda strict: the upload limit is relatively small, given that I prefer to store the full size versions of the photos I already have in my personal gallery. I’d immediately set Flickr aside months ago because there was no way I could use it to store all my images. However, upon review tonight I discovered that an annual Flickr account during their beta period is only about 45 USD. Included in this is:

  • 1 GB monthly upload limit
  • Unlimited Storage
  • Unlimited Bandwidth

In addition to a few others, listed at their upgrade page.

It’s a case where I can build lots of tools and do lots of work myself, and get exactly what I want… or I can use flickr, pay a pretty minimal fee, and get 90% of everything I want with no effort, plus an additional bit of work to get that last 10%. I’ll probably still keep things locally (if for no other reason than as a backup should flickr ever go poof), but move my primary photo gathering to be flickr based.

I think I know which way I’m going to go. Once I do, I’ll keep you all updated on the progress I have with RDF.

RDF From LiveJournal

Posted in Image Description, RDF on January 5th, 2005 at 02:19:11

Typepad exports RSS 1.0, as well as FOAF. LiveJournal exports limited FOAF information because that’s really all I could squeeze out of it the first time around, when bandwidth and load time were a major concern. I wonder if with the possible buy out of LiveJournal by Six Apart, some of the LiveJournal specific XML formats will change to RDF…

For example, the Latest Images from LiveJournal is an XML format… it might be interesting to see this as an RSS feed, especially RSS 1.0. Perhaps the creation of feeds for the LiveJournal photo gallery option…

I can see a lot of room to expand the semantic content that LiveJournal emits, and I wonder if Six Apart might be in the right place to do it. As far as I’ve seen, they’ve seemed interested in doing so with Typepad. Does anyone know if something similar runs through more of their work? I’d love to see more RDF coming out of LJ. It’s a large data source and so much of it goes untapped because getting anything into the development process is very difficult.

Mm, LiveJournal sized image annotation stores… advanced semantic web development at its finest. Now that’s something I’d love to see. Imagine being allowed to check which LiveJournal users were in a photo, and then having that data emitted as RDF… allowing searches over it, finding photos with more than a certain number of people, photos of places. Geolocation. The UI isn’t hard, it would just be a bit of hackwork to get it together.

If there’s one thing I hope for, it’s that this leads to quicker turnaround on LiveJournal development time. Whether it’s with volunteers or not, I’d be willing to double my annual cost if it meant I got to see new features within a year, rather than three. That kind of interface would bring such a boost to the semantic web…

Now I’m tempted to build an external tool myself, just to demo how cool it would be.

Geolocation

Posted in Bluetooth, Geolocation, Image Description, OpenGuides on January 2nd, 2005 at 23:22:21

Geolocation is the technique of determining a user’s geographic latitude, longitude and, by inference, city, region and nation. There are a number of ways to do this: one of the common ones discussed on the internet (according to a Google search for “geolocation”, as we all know Google is the Answer) is geolocation via IP address. The kind I’m interested in is much more accurate: geolocation via GPS device.

I want to be able to know where I am. I want this for a lot of reasons, most of them geeky rather than actually reasonable. However, it would be nice to offer more specific statistics on where my pictures are actually taken, with the fine-grained accuracy that a GPS can offer. Additionally, some of my alternative projects – cell based geolocation and the like – could benefit from actual coordinates on which to base everything from restaurant locations to searches. Openguides is, in particular, one area that could benefit from this.

I want something that works over bluetooth. My laptop and phone both speak bluetooth, and something with an actual display is out of my price range, for the most part, so I want something I can use my phone to get data out of. (USB / serial obviously doesn’t work for that.) From what I understand, most GPS devices which support NMEA are going to work okay for communication, as there are tools out there which support them. (Whether I can get the thing to talk over bluetooth is a different concern, but one I’m becoming more proficient at every day.)
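For a sense of what "supports NMEA" means in practice, here is a hedged sketch that pulls a position out of one GGA sentence, the line type that carries the fix. It assumes the sentence has already been read as text from the device (the Bluetooth side is not shown), it does not verify the checksum, and it says nothing about how the Bluelogger itself behaves.

```python
def _to_degrees(value, hemisphere):
    """Convert NMEA (d)ddmm.mmmm plus a hemisphere letter to signed decimal degrees."""
    dot = value.index(".")
    degrees = int(value[:dot - 2])      # everything before the two minute digits
    minutes = float(value[dot - 2:])    # mm.mmmm
    result = degrees + minutes / 60.0
    return -result if hemisphere in ("S", "W") else result


def parse_gga(sentence):
    """Return (lat, lon) from a GGA sentence, or None when the receiver has no fix."""
    body = sentence.strip().split("*")[0]   # drop the trailing checksum, unchecked
    fields = body.split(",")
    if not fields[0].endswith("GGA") or len(fields) < 7:
        raise ValueError("not a GGA sentence")
    if fields[6] in ("", "0"):              # fix quality 0 means no position
        return None
    return _to_degrees(fields[2], fields[3]), _to_degrees(fields[4], fields[5])


if __name__ == "__main__":
    print(parse_gga("$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"))
```

Latitude uses two degree digits and longitude three, which is why the split is taken relative to the decimal point rather than at a fixed offset.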

For a long time, all I could find for Bluetooth GPS devices were 200-250 USD and up. However, while discussing it with someone in #mobitopia on Freenode, I found the Delorme Bluelogger, a Bluetooth GPS device for $150. Matt already posted about our discussion, but I hadn’t yet.

I have some cash left over from Christmas, and I know that I almost never actually buy anything for myself. So, I’m going to splurge, and I’m going to get it. I’m going to learn to use it, and I’m going to do all kinds of neat things with it. Plans include:

  • GPS Annotation of Photos – This rolls into my photo annotation project, and is part of the reason I was keen to get it done: I want to actually have some fun queries for normal people (rather than just RDQL-aware people) over my photos.
  • Location Based Description of things for Openguides – Describing where things are with GPS coordinates allows searches by distance. Once I have that, the guide allows more niftyness.
  • Association of Cell IDs to Geo locations – Tied to the previous, this allows me to know where I am based on a Cell ID: Useful for “what’s nearby”, as well as useful for the general “where are you” that I like to be able to do – with just my cell phone.
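The "searches by distance" idea in the list above comes down to a great-circle distance between two coordinate pairs. A minimal sketch, assuming places are stored as `(name, lat, lon)` tuples and using the haversine formula with a mean Earth radius of 6371 km; the place names in the usage example are invented.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; good enough for "what's nearby"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(places, lat, lon, radius_km):
    """Return (distance_km, name) pairs within radius_km, nearest first."""
    hits = [(haversine_km(lat, lon, p_lat, p_lon), name) for name, p_lat, p_lon in places]
    return sorted((d, n) for d, n in hits if d <= radius_km)


if __name__ == "__main__":
    places = [("cafe", 0.0, 0.01), ("museum", 0.0, 1.0)]  # hypothetical entries
    print(nearby(places, 0.0, 0.0, 5.0))
```

A Cell ID lookup would slot in front of this: map the ID to a stored coordinate pair, then pass that pair to `nearby`.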

All in all, some of the apps I have in mind seem nifty, some geeky, some just demonstrative of something bigger. Some are RDF related, some are just fun. The Bluelogger seems like a decent tool to achieve everything I need to.

Image Annotation

Posted in Image Description on December 26th, 2004 at 23:12:27

Over the weekend, I had some fly time to work on my image annotation application. I had asked for a bit of help on the way to get input in Python, and sbp pointed me towards “raw_input()”, which is what my application is based around.

Originally, it was going to be written using the Redland python bindings. I had prepared myself for the flight by installing Redland, and browsing a bunch of pages with annotation examples, which Slogger grabbed for me and stuck into a local log. With this, I thought I was prepared. So, I got on the plane, got past 10,000 feet, opened the laptop, executed my program (I already had about 20 lines of code)… and smacked myself in the head as it complained that there was no module RDF.

You see, Redland has two parts: the library itself and the language bindings. You kinda need both for working in Python.

So, after a little bit of thinking, I remembered that I had installed rdflib for testing of n3p, and decided that I would convert my existing code to that. In about 15 minutes, I was back up and running.

The program is simple, although it’s still lacking some important functionality. It basically just asks a series of questions about the image you tell it to annotate. Sample program input is available, as well as the sample output, and the sample passed through cwm to demonstrate how it looks when cleaned up.

You’ll notice that there’s data there that I didn’t enter: that data is brought in from a FOAF file. This file is only specified in the code at the moment. The code intelligently works on the name you give and checks for either a nickname or name matching: if there are multiples, it provides you a list from which you can choose a number given. In any entry form, you can just hit enter to either accept the default or skip past it.
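The matching behaviour described above can be sketched like this. It is not the code from depiction.py: the people are plain dictionaries standing in for whatever gets read out of the FOAF file, and the function names are invented. In Python 3 the built-in `input()` plays the role that `raw_input()` did.

```python
def find_people(people, query):
    """Return people whose name, any word of the name, or nick matches the query."""
    q = query.strip().lower()
    if not q:
        return []
    matches = []
    for person in people:
        name = person.get("name", "").lower()
        nick = person.get("nick", "").lower()
        if q == name or q == nick or q in name.split():
            matches.append(person)
    return matches


def choose(matches, pick_fn=input):
    """Return the only match, or ask for a number when there are several."""
    if not matches:
        return None
    if len(matches) == 1:
        return matches[0]
    for i, person in enumerate(matches, 1):
        print("%d. %s (%s)" % (i, person["name"], person.get("nick", "")))
    answer = pick_fn("Which one? [1] ").strip()
    if not answer:
        return matches[0]           # bare Enter accepts the first entry
    if not answer.isdigit():
        return None
    index = int(answer) - 1
    return matches[index] if 0 <= index < len(matches) else None
```

Passing the prompt function in as `pick_fn` keeps the lookup testable without a terminal.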

The source is available, as depiction.py. This source code is messy, the way it’s laid out is very procedural, and you’ll have to modify things inside the code in order to get it to work for you. (Specifically, the foaf.rdf file is hardcoded to the location of mine.) The wordnet features are the newest and the least complete. I’m going to continue working on it, and the application is not considered even alpha-level release yet. However, I know that other people are interested in the arena, so feel free to take the code. You will need RDFLib to run it. It is MIT licensed. Share and enjoy!

Image Description – What I want

Posted in Image Description on December 19th, 2004 at 09:31:48

I have about 1500 images that I’ve taken over the last 1.5 years since I got a digital camera. These photos vary wildly in what they contain, from parties with friends to buildings to family photos to stuff around the house. There is a variety of images depicting people from all walks of life, from my college friends to my family now.

I want to search them.

I’ve become very convinced that the best way to store the information about these photos is in RDF: It’s the only data description framework that lets me extend and expand my descriptions to include everything I need. (There’s also the fact that I’m an RDF nut, so everything is a case for RDF.) Once the data is created, adding it to an RDF store and asking that store questions will not be difficult; the main problem is to get the data created.

There are some web based tools for photo/image annotation: mortenf uses a tool he created which spits out ntriples, and there is also the java app attached to the foaf codepiction project. However, neither one of these lets me describe my images very easily; There’s a lot of effort to go through for it.

My ideal photo annotation tool works like this:

  • Provide a URL to a photo.
  • This photo may or may not have EXIF data included. If it does, use the exif data to generate the appropriate RDF; something along the lines of the EXIF metadata created by Masahide’s exif2rdf web app would be good.
  • Allow me to describe the image with a title and description: dc:title and dc:description.
  • Allow me to describe the person who took the photo: foaf:maker. This part can work from the existing triple store I have; Let me enter a name, and then use that name, along with an mbox_sha1sum, to describe the person.
  • Offer me a license list: a list of CC licenses to choose from.
  • Allow me to describe “things” that are included in the photo: entering a term and checking it against wordnet, then using the right sense of the word.
  • Allow me to describe “people” in the photo: enter a name or nickname, and then present me a list of options from my triple store, or the option to describe a new person. Once I choose a person, a name + IFP should be included in the output so that I can smush the data together.
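The `mbox_sha1sum` mentioned in the list is defined by FOAF as the SHA1 hash of the full `mailto:` URI, and it is one of the inverse functional properties that makes smushing possible. A short sketch, with the address and the `same_person` helper being illustrative only:

```python
import hashlib


def mbox_sha1sum(email):
    """Return the foaf:mbox_sha1sum for an address: SHA1 of the full mailto: URI."""
    address = email.strip()
    if not address.lower().startswith("mailto:"):
        address = "mailto:" + address
    return hashlib.sha1(address.encode("utf-8")).hexdigest()


def same_person(a, b):
    """Two descriptions can be merged when they share an mbox_sha1sum value."""
    key = a.get("mbox_sha1sum")
    return key is not None and key == b.get("mbox_sha1sum")


if __name__ == "__main__":
    print(mbox_sha1sum("alice@example.org"))
```

Storing the hash rather than the address lets the output identify a person without publishing their email.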

I’d like for the entire interface to be text based. I don’t want a web or graphical app: with 1500 photos, that’s just too damn much. I want to be able to describe these things quickly and easily, so very little should get in the way, with as few keystrokes as possible. If I want to skip something, I can just hit enter to skip over it.

I’m lucky in some respects, because some of the hard work of gathering information about people is already done for me: I have a large triple store at hand that can help me with this project. I also have some limited experience in working with the Redland framework, so I may even have the experience I need to make this work. So, here’s how I see it working on the inside:

Create an interface along the lines of what Config::Tiny (in perl) does: Question [default] ? then allow for the user to either skip past (and use the default, which may be none in some cases) or to use their option: if their option, possibly provide a list of choices after that. For example, if I enter “Jessica” as a name, it should tell me “There is no name Jessica in the database: would you like to create a new user?” with a yes or no option; if no, it would go back to the previous step and let me enter a different name. Dump the data, as it’s created, to an in-memory Redland model. Once the photo description is complete, serialize the model to a file, which can then be run through cwm to clean it up for more permanent storage.
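One possible shape for that prompt loop in Python, as a sketch only: the function names are invented, the "database" is just a set of known names, and the Redland model and serialization steps are left out. The `input_fn` parameter defaults to the built-in `input()` and exists so the loop can be driven by scripted answers.

```python
def ask(question, default=None, input_fn=input):
    """Show 'Question [default] ? ' and return the answer, or the default on bare Enter."""
    if default is not None:
        prompt = "%s [%s] ? " % (question, default)
    else:
        prompt = "%s ? " % question
    answer = input_fn(prompt).strip()
    return answer if answer else default


def ask_person(known_names, input_fn=input):
    """Ask for a name until it is known, skipped, or the user agrees to create it.

    Returns (name, created), where created is True for a new person.
    """
    while True:
        name = ask("Name", input_fn=input_fn)
        if name is None:
            return None, False              # skipped with bare Enter
        if name in known_names:
            return name, False              # existing person
        msg = "There is no name %s in the database: create a new user" % name
        if ask(msg, "no", input_fn).lower().startswith("y"):
            return name, True               # new person
        # Anything else goes back to the name question.


if __name__ == "__main__":
    print(ask_person({"Tony"}))
```

Defaulting the create question to "no" means an accidental Enter never adds a person by mistake.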

I’m probably going to actually get off my butt and write this pretty soon: I have a lot of interest in describing this data, in part because it’s another case where a nifty project can lead to a great demonstration of why and how RDF can work. I’m most interested in someone who may have suggestions on what kind of tools would be good for creating the interface: the last time I did any kind of user input other than command line was when I was working with cin>> statements in C++. I’m probably going to write it in Python (I feel that I really accomplish things much more quickly writing in that language than in either perl or PHP), so tips on how to create command line user interfaces in Python are appreciated.