More Musicbrainz…

What I posted about yesterday was obviously too ridiculously difficult to actually be a real solution to the problem. So, I set about making something that works at least a little bit better.

It’s possible to generate “TRM”s for songs you have. These TRMs are basically acoustic identifiers for the track: they let you identify the song based on the way it sounds. This is how Musicbrainz does its identification. Yesterday, I installed a bunch of Musicbrainz stuff in an effort to get this working, and did end up finding something that will generate TRMs. My current song, Chumbawamba’s Tubthumping, has a TRM of 776643d0-9b47-4eb9-8d29-608fa9ccedcd.

So, I can generate TRMs, but that doesn’t get me very far: I still need to figure out which track the TRM belongs to. Since I’m doing this mostly non-interactively, I’m just going to use the most popular track with that TRM. (This doesn’t always work: so far this evening, it’s been right roughly 80% of the time.) So, I fetch the RDF version of the TRM file, which for the song I mentioned earlier can be retrieved from http://musicbrainz.org/trmid/776643d0-9b47-4eb9-8d29-608fa9ccedcd.
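The lookup URL is just the TRM appended to a fixed prefix. A minimal sketch, assuming the TRM is a standard UUID string; the helper name `trm_lookup_url` is made up for illustration and is not part of Julie:

```python
import uuid

TRM_BASE = "http://musicbrainz.org/trmid/"  # URL prefix quoted in the post


def trm_lookup_url(trm):
    """Return the RDF lookup URL for a TRM, rejecting malformed IDs."""
    # uuid.UUID raises ValueError if the string is not a well-formed UUID.
    canonical = str(uuid.UUID(trm))
    return TRM_BASE + canonical


print(trm_lookup_url("776643d0-9b47-4eb9-8d29-608fa9ccedcd"))
```

Validating first means a mistyped TRM fails locally instead of producing a request for a nonsense URL.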

The first song in the “tracklist” RDF bag is the one that is the best match, so I’ll grab that track. I can then fetch that track’s URI as well, and get the creator ID from that file. All these files can be tossed into the general RDF model I keep lying around, along with the Turtle that I mentioned in the earlier entry: [a foaf:Person; foaf:nick "crschmidt"; menow:hasStatus [a menow:Status; dc:date "timestamp"; menow:listeningTo <trackuri>]].
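Picking the first member of an RDF bag comes down to following the rdf:_1 property from the bag node. Here is a rough library-free sketch, assuming the triples are plain (subject, predicate, object) string tuples; the `first_track` helper and the example.org track URIs are placeholders, not real Musicbrainz data:

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MM_NS = "http://musicbrainz.org/mm/mm-2.0#"


def first_track(triples):
    """Return the object of rdf:_1 in the mm:trackList bag, or None."""
    # Find the bag node that mm:trackList points at.
    bags = [o for (s, p, o) in triples if p == MM_NS + "trackList"]
    if not bags:
        return None
    # rdf:_1 is the first numbered member of an RDF container.
    for s, p, o in triples:
        if s == bags[0] and p == RDF_NS + "_1":
            return o
    return None


# Made-up example data: the track URIs are placeholders.
triples = [
    ("http://musicbrainz.org/trmid/776643d0-9b47-4eb9-8d29-608fa9ccedcd",
     MM_NS + "trackList", "_:bag"),
    ("_:bag", RDF_NS + "_2", "http://example.org/track/second"),
    ("_:bag", RDF_NS + "_1", "http://example.org/track/first"),
]
print(first_track(triples))
```

The lookup goes by the rdf:_1 predicate rather than by position in the list, since triples in a model carry no guaranteed order.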

Then, I can issue a query against the model: since I know the time, I only return the most recent result:

select ?t, ?n, ?d where (?p foaf:nick "crschmidt") (?p menow:hasStatus ?s) (?s dc:date ?d) (?s menow:listeningTo ?o) (?o dc:title ?t) (?o dc:creator ?a) (?a dc:title ?n) AND ?d =~ /timestamp/
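The `?d =~ /timestamp/` clause is a regular-expression filter on the date literal. A rough plain-Python analogue, assuming the result rows are (date, title, artist) tuples; the `filter_by_date` helper and the second sample row are invented for illustration:

```python
import re


def filter_by_date(rows, pattern):
    """Keep rows whose date string matches the regex, like ?d =~ /pattern/."""
    regex = re.compile(pattern)
    return [row for row in rows if regex.search(row[0])]


rows = [
    ("2005-03-30T04:24:01Z", "Tubthumping", "Chumbawamba"),
    ("2005-03-29T23:10:00Z", "Example Track", "Example Artist"),  # made-up row
]

# A full timestamp selects a single status; a date prefix selects a whole day.
print(filter_by_date(rows, "2005-03-30T04:24:01Z"))
print(filter_by_date(rows, "2005-03-30"))
```

Because the match is a regex search rather than an equality test, shortening the pattern widens the result set from one status to a day or a month.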

The end result? A couple hundred extra triples loaded into the global model, and I can see:

23:23:56 <crschmidt> ^listeningTo 776643d0-9b47-4eb9-8d29-608fa9ccedcd
23:24:02 <julie> 2005-03-30T04:24:01Z Tubthumping Chumbawamba

Some of the tracks I’ve been listening to tonight can be shown via: ^q select ?t, ?n, ?d where (?p foaf:nick "crschmidt") (?p menow:hasStatus ?s) (?s dc:date ?d) (?s menow:listeningTo ?o) (?o dc:title ?t) (?o dc:creator ?a) (?a dc:title ?n) AND ?d =~ /2005-03-30/. Feel free to stop by #julie on irc.freenode.net and try it!

Code for the above function in Julie:

# Fragment of a Julie command handler. Needs "import RDF" (Redland bindings)
# and "import time"; m, origin, args and self come from the enclosing method.
mm = RDF.NS("http://musicbrainz.org/mm/mm-2.0#")
rdf = RDF.NS("http://www.w3.org/1999/02/22-rdf-syntax-ns#")
dc = RDF.NS("http://purl.org/dc/elements/1.1/")
model = RDF.Model(get_storage(self.db_password))
trm = m.group(1)
# Scratch model for this lookup (rebinds m, which held the TRM match until now).
m = RDF.Model()
p = RDF.Parser()
# Load the TRM description into both the scratch model and the global model.
p.parse_into_model(m, "http://musicbrainz.org/trmid/%s" % trm)
p.parse_into_model(model, "http://musicbrainz.org/trmid/%s" % trm)
# The first member (rdf:_1) of the mm:trackList bag is taken as the best match.
tl = m.find_statements(RDF.Statement(None, mm.trackList, None))
tracklistBag = tl.current().object
t1 = m.find_statements(RDF.Statement(tracklistBag, rdf['_1'], None))
trackId = t1.current().object.uri
p.parse_into_model(m, trackId)
p.parse_into_model(model, trackId)
# Look up the track's creator and load that description as well.
c = m.find_statements(RDF.Statement(trackId, dc.creator, None))
creator = c.current().object.uri
p.parse_into_model(model, creator)
# Record the "listening to" status as Turtle, stamped with the current UTC time.
tp = RDF.TurtleParser()
timestamp = time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime())
turtle = ("""[a foaf:Person; foaf:nick "%s";
 menow:hasStatus [a menow:Status; dc:date "%s";
 menow:listeningTo <%s>]]."""
    % (split_origin(origin)[0], timestamp, trackId))
turtlestr = turtle
# Prepend an @prefix line for every namespace the Turtle snippet may use.
for key, value in common_namespaces().items():
    turtlestr = "@prefix %s: <%s> .\n%s" % (key, value, turtlestr)
tp.parse_string_into_model(model, turtlestr,
    RDF.Uri("http://crschmidt.net/julie/data/"))
# Query for the status just added, matched by its timestamp.
self.query_thread("""select ?t, ?n, ?d where (?p foaf:nick "%s")
  (?p menow:hasStatus ?s)
  (?s dc:date ?d)
  (?s menow:listeningTo ?o)
  (?o dc:title ?t)
  (?o dc:creator ?a)
  (?a dc:title ?n) AND ?d =~ /%s/"""
  % (split_origin(origin)[0], timestamp), origin, args, model)
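
The Turtle-building step in the code above can be pulled out and tried without Redland. This is only a sketch under assumptions: the post doesn’t show `common_namespaces()`, so the prefix map here is a stand-in, the menow namespace URI is a placeholder, and `status_turtle` is a hypothetical helper name:

```python
import time

# Assumed prefix map: a stand-in for Julie's common_namespaces().
# The menow URI is a placeholder, not the real namespace.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "menow": "http://example.org/menow#",
}


def status_turtle(nick, track_uri, timestamp=None, prefixes=PREFIXES):
    """Build a Turtle document saying that `nick` is listening to `track_uri`."""
    if timestamp is None:
        # Same UTC timestamp format as the original code.
        timestamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    # One @prefix line per namespace, sorted so the output is stable.
    header = "".join(
        "@prefix %s: <%s> .\n" % (key, value)
        for key, value in sorted(prefixes.items())
    )
    body = (
        '[a foaf:Person; foaf:nick "%s";\n'
        ' menow:hasStatus [a menow:Status; dc:date "%s";\n'
        ' menow:listeningTo <%s>]].' % (nick, timestamp, track_uri)
    )
    return header + body


print(status_turtle("crschmidt", "http://example.org/track/1",
                    "2005-03-30T04:24:01Z"))
```

Note that the nick is pasted into the string unescaped, so a nick containing a double quote would produce invalid Turtle; a more careful version would escape it first.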

2 Responses to “More Musicbrainz…”

  1. Philip Newton Says:

    I assume the line ending in lots of spaces plus “c = m.find_statements(RDF.Statement(trackId, dc.creator, None))” is intended to be two lines… could you please change that?

    It’s breaking my friends page on LiveJournal, and when I go directly to your blog, the end of the line doesn’t appear, so it seems as if c is never assigned to before being used.

  2. Christopher Schmidt Says:

    Philip: Changed. sorry about that.