Music analysis and retrieval using large datasets of symbolic musical data has been hampered by the lack of an adequate, standardized format for symbolic music representation supported by commercial software tools. This gap makes it difficult to acquire and reuse either musical data or musical tools. The tools that are developed for music analysis research do not have the technical underpinnings to scale up to large-scale commercial usage of music information retrieval. The need to use databases to build collections of symbolic music information is well understood [7], but the technology has been lacking.
Building scalable database systems is a costly undertaking. It makes more sense for music applications to leverage the investment of other, better-funded application areas such as electronic commerce, as long as that technology is adequate (not necessarily ideal) for the needs of musical applications.
XML has the potential to finally break through the database barrier through the efforts of the World Wide Web Consortium’s XML Query working group. The group’s mission is “to provide flexible query facilities to extract data from real and virtual documents on the Web, therefore finally providing the needed interaction between the web world and the database world. Ultimately, collections of XML files will be accessed like databases.” [18]
The current focus of the XML Query Working Group is the XQuery 1.0 language. Though this language is still a work in progress, available only in working draft form, there are already a dozen prototype implementations available for evaluation. These come both from major relational database vendors like Oracle and Microsoft and from native XML database vendors like Software AG.
The combination of an XML language for music and an XML query language is not sufficient by itself to break through the database barrier for music information retrieval. The two languages must be able to work together to solve musical problems. Early XQuery working drafts had significant problems in this area, lacking powerful facilities to deal with queries that combine aspects of sequence and hierarchy. These shortcomings have been addressed in the XQuery 1.0 working draft of April 30, 2002, and we have now been able to build our first interesting musical queries using XQuery and MusicXML.
Given XQuery’s importance and scope, it is likely to be some time yet before the language definition is completed and issued as a W3C recommendation, and before commercial tools are made available for effective development of XQuery applications. Fortunately, for research purposes, many analysis applications can be developed effectively today with existing tools: the XML Document Object Model (DOM) [17] and the XML Path Language 1.0 (XPath) [3].
Musical analysis is not just applicable in musicological research; it can also be useful in music publishing. For instance, as Recordare publishes its editions of classical art songs, it is helpful to show the range of each song. This process can be automated by a musical analysis program working on the MusicXML data. Figure 3 shows a screen shot from a program that generates a distribution graph of the pitch range for any particular part in a piece of music. Here we are computing the range for the voice part of the last song in Schumann’s Frauenliebe und Leben, Op. 42.
Figure 4 shows the synopsis produced by clicking on the “Report” button. It focuses on the low and high notes.
The program that generates this synopsis report is easy to write using MusicXML data. For comparison, we will show two implementations. The first uses the DOM, programmed in Visual Basic 6.0 with Microsoft’s MSXML3 parser. The second is an equivalent program built using XQuery. Our example uses the QuiP 2.1.1 prototype program from Software AG, which is based on the April 30 working draft of XQuery 1.0. QuiP and XQuery are both works in progress, so the syntax of a working program is likely to change by the time XQuery becomes a formal recommendation from the World Wide Web Consortium.
The DOM approach is implemented within a function that takes a MusicXML document and MusicXML part ID as input, and returns the dialog box string as output. After the initial variable declaration and initialization, the variable oNodes is assigned to all the <pitch> elements within the <part> specified by the PartID parameter. The selection is made using XPath 1.0 syntax.
The program then loops through each pitch, calling the MIDINote function to compute the MIDI note value from the different components of the <pitch> element. If the resulting pitch is lower or higher than any seen before, the spelling of the note is saved in a variable, using a separate SpellNote function on the same <pitch> element. The measure containing the extreme pitch is also saved.
After all the pitches are searched, the program returns a string composed from the saved values for the lowest and highest MIDI pitches, along with their musical spellings and the measure where they were first encountered.
    Function FindRange _
        (ThisXML As DOMDocument30, _
         ByVal PartID As String) As String

      Dim oRoot As IXMLDOMElement     ' Root of XML document
      Dim oNodes As IXMLDOMNodeList   ' Pitches to analyze
      Dim oElement As IXMLDOMElement  ' Current pitch
      Dim oMeasure As IXMLDOMElement  ' Parent measure
      Dim lPitch As Long              ' Current pitch
      Dim lMinPitch As Long           ' Lowest MIDI pitch
      Dim sMinPitch As String         ' Spelling of low pitch
      Dim lMaxPitch As Long           ' Highest MIDI pitch
      Dim sMaxPitch As String         ' Spelling of high pitch
      Dim sMinMeasure As String       ' Measure for low pitch
      Dim sMaxMeasure As String       ' Measure for high pitch

      ' Start outside the MIDI range of 0 to 127 so that
      ' the first pitch found replaces both values.
      lMinPitch = 128
      lMaxPitch = -1

      ' Select the pitches from the document passed in.
      Set oRoot = ThisXML.documentElement
      Set oNodes = _
        oRoot.selectNodes( _
          "//part[@id='" & PartID & "']//pitch")

      ' Search each pitch for the lowest and highest
      ' values, saving the spelling and measure number.
      Do
        Set oElement = oNodes.nextNode
        If oElement Is Nothing Then Exit Do
        lPitch = MIDINote(oElement)

        If lPitch < lMinPitch Then
          lMinPitch = lPitch
          sMinPitch = SpellNote(oElement)
          Set oMeasure = _
            oElement.selectSingleNode _
              ("ancestor::measure")
          sMinMeasure = _
            oMeasure.getAttribute("number")
        End If

        If lPitch > lMaxPitch Then
          lMaxPitch = lPitch
          sMaxPitch = SpellNote(oElement)
          Set oMeasure = _
            oElement.selectSingleNode _
              ("ancestor::measure")
          sMaxMeasure = _
            oMeasure.getAttribute("number")
        End If
      Loop

      FindRange = "Lowest note is " & sMinPitch & _
        " (MIDI " & lMinPitch & _
        ") in measure " & sMinMeasure & vbCrLf & _
        "Highest note is " & sMaxPitch & _
        " (MIDI " & lMaxPitch & _
        ") in measure " & sMaxMeasure

    End Function
The MIDINote function builds the MIDI note number by reading the <octave>, <step>, and <alter> elements in turn to build the note number value. The CLng function called here casts the string returned by the XML Element into a 32-bit integer (the Long type in Visual Basic 6.0).
    ' Return MIDI note value from a MusicXML pitch
    ' element, ignoring microtones.
    Function MIDINote _
        (ThisPitch As IXMLDOMElement) As Long

      Dim oElement As MSXML2.IXMLDOMElement
      Dim lTemp As Long   ' Temporary pitch

      ' Get octave
      Set oElement = _
        ThisPitch.selectSingleNode("octave")
      lTemp = 12 * (CLng(oElement.Text) + 1)

      ' Get pitch step
      Set oElement = _
        ThisPitch.selectSingleNode("step")
      Select Case oElement.Text
        Case "a", "A": lTemp = lTemp + 9
        Case "b", "B": lTemp = lTemp + 11
        Case "c", "C": lTemp = lTemp + 0
        Case "d", "D": lTemp = lTemp + 2
        Case "e", "E": lTemp = lTemp + 4
        Case "f", "F": lTemp = lTemp + 5
        Case "g", "G": lTemp = lTemp + 7
      End Select

      ' Get alteration if any
      Set oElement = _
        ThisPitch.selectSingleNode("alter")
      If Not oElement Is Nothing Then
        lTemp = lTemp + CLng(oElement.Text)
      End If

      ' Assign and exit
      MIDINote = lTemp

    End Function
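For readers without a Visual Basic 6.0 environment, the same arithmetic can be sketched in Python with the standard library's ElementTree module. This is an illustrative sketch rather than part of the original programs: the names midi_note and STEP_OFFSETS are hypothetical, and fractional <alter> values are assumed to be truncated toward zero. The two sample pitches are the extreme notes reported for the Schumann song later in this section.

```python
import xml.etree.ElementTree as ET

# Semitone offsets of the natural pitch steps above C.
STEP_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def midi_note(pitch: ET.Element) -> int:
    """Return the MIDI note number for a MusicXML <pitch> element.

    Assumption of this sketch: fractional (microtonal) <alter> values
    are truncated toward zero.
    """
    octave = int(pitch.findtext("octave"))
    step = pitch.findtext("step").strip().upper()
    alter_text = pitch.findtext("alter")
    alter = int(float(alter_text)) if alter_text is not None else 0
    # Octave 4 starts at MIDI note 60 (middle C), hence the "+ 1".
    return 12 * (octave + 1) + STEP_OFFSETS[step] + alter


c_sharp_4 = ET.fromstring(
    "<pitch><step>C</step><alter>1</alter><octave>4</octave></pitch>"
)
d_5 = ET.fromstring("<pitch><step>D</step><octave>5</octave></pitch>")

print(midi_note(c_sharp_4))  # 12 * (4 + 1) + 0 + 1 = 61
print(midi_note(d_5))        # 12 * (5 + 1) + 2 + 0 = 74
```

The lookup table replaces the Select Case statement. A missing <alter> element contributes zero, as in the Visual Basic version.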
The SpellNote function is even more straightforward, as the only conversion that needs to be done is to go from the numeric <alter> value to a text symbol for the sharps and flats in the note spelling.
    ' Spell the pitch as a string, e.g. "C#4"
    Function SpellNote _
        (ThisPitch As IXMLDOMElement) As String

      Dim oElement As IXMLDOMElement
      Dim sSpell As String   ' Temporary string
      Dim sAlter As String   ' Alteration string

      ' Get pitch step
      Set oElement = _
        ThisPitch.selectSingleNode("step")
      sSpell = UCase$(oElement.Text)

      ' Get alteration if any
      Set oElement = _
        ThisPitch.selectSingleNode("alter")
      If Not oElement Is Nothing Then
        Select Case CLng(oElement.Text)
          Case -2: sAlter = "bb"
          Case -1: sAlter = "b"
          Case 0: sAlter = vbNullString
          Case 1: sAlter = "#"
          Case 2: sAlter = "##"
          Case Else
            sAlter = "(" & oElement.Text & ")"
        End Select
        sSpell = sSpell & sAlter
      End If

      ' Get octave
      Set oElement = _
        ThisPitch.selectSingleNode("octave")
      sSpell = sSpell & oElement.Text

      ' Assign and exit
      SpellNote = sSpell

    End Function
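The spelling step can be sketched the same way in Python. Again this is only an illustration under stated assumptions: spell_note and ALTER_SYMBOLS are hypothetical names. Unlike the CLng conversion above, this sketch does not round fractional alterations to an integer before the lookup, so a microtonal value falls through to the parenthesized form.

```python
import xml.etree.ElementTree as ET

# Text symbols for the common whole-semitone alterations.
ALTER_SYMBOLS = {-2: "bb", -1: "b", 0: "", 1: "#", 2: "##"}


def spell_note(pitch: ET.Element) -> str:
    """Spell a MusicXML <pitch> element as text, for example "C#4"."""
    spelling = pitch.findtext("step").strip().upper()
    alter_text = pitch.findtext("alter")
    if alter_text is not None:
        # A float such as 1.0 matches the integer key 1 in the table;
        # any other value is shown in parentheses.
        alter = float(alter_text)
        spelling += ALTER_SYMBOLS.get(alter, "(" + alter_text.strip() + ")")
    return spelling + pitch.findtext("octave").strip()


c_sharp_4 = ET.fromstring(
    "<pitch><step>C</step><alter>1</alter><octave>4</octave></pitch>"
)
b_flat_3 = ET.fromstring(
    "<pitch><step>B</step><alter>-1</alter><octave>3</octave></pitch>"
)

print(spell_note(c_sharp_4))  # C#4
print(spell_note(b_flat_3))   # Bb3
```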
Our XQuery implementation follows a similar approach to the DOM implementation. Since QuiP is a standalone prototype tool for learning XQuery, we have hardcoded the file name and part ID that were parameterized in the DOM example. This example takes a very simple approach to the query, reviewing all the pitches twice in order to locate the minimum and maximum values. Once we have these values, we then find the pitch elements whose MIDI note values match the high and low values. XQuery results are returned in XML format, so we do not need a SpellNote function. We simply output the first <pitch> elements that match each of the extreme values, and then find the number of the measure that contains the first instance of these matching elements. XQuery makes use of XPath 2.0 and does not support the ancestor:: axis, so our query assumes the <measure> element is the grandparent of the <pitch> element. Therefore this query will only work with partwise MusicXML files, not timewise files. We have revised the syntax slightly to better match the XQuery working draft, using the string function where QuiP 2.1.1 used the string-value function. [See updated examples for the November 15, 2002 XQuery working draft, published after this paper was presented.]
    define function MIDINote(element $thispitch) returns integer
    {
      let $step := $thispitch/step
      let $alter :=
        if (empty($thispitch/alter)) then 0
        else if (string($thispitch/alter) = "1") then 1
        else if (string($thispitch/alter) = "-1") then -1
        else 0
      let $octave := integer(string($thispitch/octave))
      let $pitchstep :=
        if (string($step) = "C") then 0
        else if (string($step) = "D") then 2
        else if (string($step) = "E") then 4
        else if (string($step) = "F") then 5
        else if (string($step) = "G") then 7
        else if (string($step) = "A") then 9
        else if (string($step) = "B") then 11
        else 0
      return 12 * ($octave + 1) + $pitchstep + $alter
    }

    let $doc := document("MusicXML/Frauenliebe8.xml")
    let $part := $doc//part[./@id = "P1"]
    let $highnote := max(for $pitch in $part//pitch
                         return MIDINote($pitch))
    let $lownote := min(for $pitch in $part//pitch
                        return MIDINote($pitch))
    let $highpitch := $part//pitch[MIDINote(.) = $highnote]
    let $lowpitch := $part//pitch[MIDINote(.) = $lownote]
    let $highmeas := string($highpitch[1]/../../@number)
    let $lowmeas := string($lowpitch[1]/../../@number)
    return
      <result>
        <low-note>{$lowpitch[1]}
          <measure>{$lowmeas}</measure>
        </low-note>
        <high-note>{$highpitch[1]}
          <measure>{$highmeas}</measure>
        </high-note>
      </result>
This query returns the following result in XML:
    <?xml version="1.0"?>
    <result>
      <low-note>
        <pitch>
          <step>C</step>
          <alter>1</alter>
          <octave>4</octave>
        </pitch>
        <measure>16</measure>
      </low-note>
      <high-note>
        <pitch>
          <step>D</step>
          <octave>5</octave>
        </pitch>
        <measure>12</measure>
      </high-note>
    </result>
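Python's ElementTree, like the XQuery draft used here, offers no way to step from a <pitch> element back up to its enclosing <measure>. One way around this, sketched below, is to record the measure number while walking down through a partwise file. The score fragment, the function names, and the tuple-based return value are all hypothetical and chosen for illustration; the fragment is not taken from the Schumann song.

```python
import xml.etree.ElementTree as ET

STEP_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Hypothetical three-measure excerpt used only to exercise the sketch.
SCORE = """
<score-partwise>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>E</step><octave>4</octave></pitch></note>
      <note><pitch><step>D</step><octave>5</octave></pitch></note>
    </measure>
    <measure number="2">
      <note><rest/></note>
      <note><pitch><step>C</step><alter>1</alter><octave>4</octave></pitch></note>
    </measure>
    <measure number="3">
      <note><pitch><step>D</step><octave>5</octave></pitch></note>
    </measure>
  </part>
</score-partwise>
"""


def midi_note(pitch):
    """MIDI note number of a <pitch> element (microtones truncated)."""
    alter = int(float(pitch.findtext("alter", default="0")))
    octave = int(pitch.findtext("octave"))
    return 12 * (octave + 1) + STEP_OFFSETS[pitch.findtext("step").upper()] + alter


def find_range(root, part_id):
    """Return (midi, measure number) pairs for the lowest and highest notes."""
    part = root.find(f"./part[@id='{part_id}']")
    # Pair every pitch with its measure number while descending the tree,
    # so no ancestor lookup is needed afterwards.
    notes = [
        (midi_note(pitch), measure.get("number"))
        for measure in part.findall("measure")
        for pitch in measure.iter("pitch")
    ]
    # min and max return the first extreme item, so ties resolve to the
    # measure where the pitch first occurs.
    low = min(notes, key=lambda item: item[0])
    high = max(notes, key=lambda item: item[0])
    return low, high


low, high = find_range(ET.fromstring(SCORE), "P1")
print(low)   # (61, '2')
print(high)  # (74, '1')
```

As with the XQuery version, this sketch assumes a partwise file in which each <measure> is a direct child of its <part>.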
Melody retrieval provides a more typical XQuery example, using a FLWR (for-let-where-return) expression. Here we are looking for the instances of the Frere Jacques theme in the key of C. We simplify this query to look just for the pitch step sequence of C, D, E, C. This query also assumes a partwise MusicXML file. It will match instances of the pitch sequence that cross <measure> boundaries, but will not match across <part> boundaries:
    <result>
      {let $doc := document("MusicXML/frere-jacques.xml")
       let $notes := $doc//note
       for $note1 in $notes[string(./pitch/step) = "C"],
           $note2 in $notes[. follows $note1][1],
           $note3 in $notes[. follows $note2][1],
           $note4 in $notes[. follows $note3][1]
       let $meas1 := $note1/..
       let $part1 := $meas1/..
       let $part2 := $note2/../..
       let $part3 := $note3/../..
       let $part4 := $note4/../..
       where string($note2/pitch/step) = "D"
         and string($note3/pitch/step) = "E"
         and string($note4/pitch/step) = "C"
         and (string($part1/@id) = string($part2/@id))
         and (string($part2/@id) = string($part3/@id))
         and (string($part3/@id) = string($part4/@id))
       return
         <motif>
           {$note1/pitch}
           {$note2/pitch}
           {$note3/pitch}
           {$note4/pitch}
           <measure>{$meas1/@number}</measure>
           <part>{$part1/@id}</part>
         </motif>
      }
    </result>
When run against a simple three-part round of Frere Jacques prepared in Finale and exported to MusicXML, the query returns six instances of the motif, the first of which is shown below:
    <?xml version="1.0"?>
    <result>
      <motif>
        <pitch>
          <step>C</step>
          <octave>5</octave>
        </pitch>
        <pitch>
          <step>D</step>
          <octave>5</octave>
        </pitch>
        <pitch>
          <step>E</step>
          <octave>5</octave>
        </pitch>
        <pitch>
          <step>C</step>
          <octave>5</octave>
        </pitch>
        <measure number="1" />
        <part id="P1" />
      </motif>
      <!-- Remaining 5 motifs removed for brevity -->
    </result>
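The matching rule used by this query (consecutive notes, free to cross <measure> boundaries but confined to a single <part>) can also be sketched in Python. The sketch below is illustrative only: find_motif and the two-part fragment are hypothetical, and like the query it compares pitch steps alone, ignoring octave, alteration, and rhythm.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-part excerpt. The motif in P1 crosses a measure
# boundary; the last note of P1 followed by the first notes of P2 would
# also spell C, D, E, C if part boundaries were ignored.
SCORE = """
<score-partwise>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>G</step><octave>4</octave></pitch></note>
      <note><pitch><step>C</step><octave>5</octave></pitch></note>
      <note><pitch><step>D</step><octave>5</octave></pitch></note>
    </measure>
    <measure number="2">
      <note><pitch><step>E</step><octave>5</octave></pitch></note>
      <note><pitch><step>C</step><octave>5</octave></pitch></note>
    </measure>
  </part>
  <part id="P2">
    <measure number="1">
      <note><pitch><step>D</step><octave>4</octave></pitch></note>
      <note><pitch><step>E</step><octave>4</octave></pitch></note>
    </measure>
    <measure number="2">
      <note><pitch><step>C</step><octave>4</octave></pitch></note>
      <note><rest/></note>
    </measure>
  </part>
</score-partwise>
"""


def find_motif(root, steps):
    """Return (part id, starting measure number) for each match of steps."""
    matches = []
    for part in root.findall("part"):
        # Flatten one part into (step, measure number) pairs in document
        # order. A rest has no <pitch>, so its step is None and it breaks
        # any match that would span it.
        notes = [
            (note.findtext("pitch/step"), measure.get("number"))
            for measure in part.findall("measure")
            for note in measure.findall("note")
        ]
        # Slide a window of len(steps) consecutive notes over the part.
        for i in range(len(notes) - len(steps) + 1):
            window = [step for step, _ in notes[i:i + len(steps)]]
            if window == steps:
                matches.append((part.get("id"), notes[i][1]))
    return matches


print(find_motif(ET.fromstring(SCORE), ["C", "D", "E", "C"]))
# [('P1', '1')]
```

Searching each part separately is what keeps the trailing C of P1 from combining with the opening D, E, C of P2. This plays the same role as the part ID comparisons in the where clause of the query.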