CS 141 Introduction to AI Greenwald

TBA

12:00 PM (Noon!), May 11, 2009

Project 5: MySong

Due:

Contents

    1  Introduction

    2  Music 101

    3  The HMM

    4  What to Do

    5  Notes
                Log Probabilities

    6  Handing In

    7  Support Code

    8  Resources

[mysong-Z-G-1.gif]

1  Introduction

Microsoft SongSmith (formerly MySong) is a research project led in part by Dan Morris (’00). Here is an overview of MySong, excerpted the MySong website:

MySong, introduced in our CHI 2008 paper, automatically chooses chords to accompany a vocal melody, allowing a user with no musical training to rapidly create accompanied music. MySong is a creative tool for folks who like to sing but would never get a chance to experiment with creating real original music. Come on, you know who you are...you sing in the car, or in the shower, or you go to karaoke clubs, or you just once in a while find yourself singing along with catchy commercial jingles. MySong is also a great tool for songwriters who want to quickly experiment with melodies and accompaniments.

It turns out that the technology underlying MySong are a couple of relatively straightforward HMMs, in which chords are states and notes are observations. Using this model, your goal in this project is to implement the Viterbi algorithm to decode a melody (i.e., a sequence of notes) into a chord progression. That is, you will build your own MySong (a.k.a, YourSong).

2  Music 101

In order to understand this write-up, you’ll need to know some basic music terminology and concepts.

You do not need to know any music theory beyond the basic definitions given above to do this project.

3  The HMM

Recall that a hidden Markov model (HMM) is a tuple H = ⟨ X, Y, Π, A, B ⟩, where X is a finite set of states, Y is a finite alphabet of observations, Π is an initial probability distribution over states, A is a transition probability matrix, and B is an emission probability matrix. For this project, we extend the definition of an HMM to also include Ψ, a final probability distribution over states.

As we noted above, in the MySong HMMs, chords are states and notes are observations. Your goal in this project is to write a program that will take as input a melody, that is, a sequence of notes, and generate a corresponding sequence of backup chords. More specifically, your program should use the Viterbi algorithm to compute the most likely chord sequence, given a melody.

Here are some more details about the MySong HMMs.

Here are some important details about the way in which we represent a melody, the input to your program:

[mysong-Z-G-2.gif]
Figure 1:  A graphical illustration of a single observed measure (normalized). The notes C, E, and G occurred most frequently during this measure.

4  What to Do

  1. Implement Viterbi to decode a melody in which the observation corresponding to each measure is reduced from a histogram of pitches to a single representative pitch. How you determine this note/pitch is up to you.

  2. The usual implementation of Viterbi applies to a sequence of observations y1:T in which each ytY. The goal of this problem is to extend Viterbi to apply to a sequence of observations y1:T in which each yt∈Δ(Y), the set of probability distributions over the set Y. More specifically, you should extend your implementation of Viterbi to decode a melody in which the observations corresponding to each measure are full histograms. To do so, instead of making use of only bk, the kth column in the emission probability matrix, as the usual Viterbi algorithm does whenever yt = kY, you should find another way to reduce the emission probability matrix to a single column—a way that incorporates all the available information about the observation yt∈Δ(Y) (not just a single statistic such as the mode).

  3. For extra credit, you can play around with the various probability matrices and/or with the Viterbi algorithm itself to see how small changes impact the chord progressions output by your program. If your chord progressions become more interesting after tweaking the probabilities and/or the algorithm, document your method. We plan to relay successful methods to Dan. Your method could be incorporated into the next release of SongSmith!

5  Notes

The transition, start, and end probability matrices are specific to the key C Major for the major model and A Minor for the minor model. This is why Π is peaked at chord 0 (C Major). A song in C Major is more likely to start on a C Major chord. However, not all of the melodies given are in C Major; in fact, most are not. Hence, for each model (major or minor), you should run the Viterbi algorithm 12 times, once per key, and then return the path with the highest probability.

In order to run Viterbi in each major key, you will need to change how you index the Π, Ψ, and A matrices, as they were all built for C Major. For instance, in order to treat the Π matrix as if it is in the key of C# Major, you will need to offset the index of each chord by - 5. That way, the start probability of the fifth chord, C# Major, will be set to the value π0. The same type of shift in indexing is required for the minor model, where all probability matrices are specific to A Minor.

Log Probabilities

Underflow is common when multiplying small probabilities, so you may find it preferable to add the logarithms of probabilities, rather than multiply the probabilities themselves. This trick is simple to apply to Viterbi, where the two only operations are multiplication and maximization. It is more difficult to apply to the the forward and backward algorithms (to compute α and β), because there the two operations are multiplication and addition.

6  Handing In

Please hand in the following:

  1. All the code necessary to solve the assignment (including our support code).

  2. A README file that includes the following:

    1. Standard bug reports.

    2. Anything that might help us read your code: design quirks, conventions, etc.

    3. Comments that might help us improve and/or extend this project in the future.

  3. A short report of how your second implementation worked, and how it differed from the first.

Hand in by typing cs141_handin mysong in the shell from the directory containing your work.

7  Support Code

As usual, install the support code by running the install script:

 \texttt{/course/cs141/bin/cs141install mysong}

The support code will be installed into your  /course/cs141/projects/ directory.

The support code is split into three packages: input, mysong and player.

The file pitchtrack.praat will be installed with the stencil code. This must be in your outer most mysong directory, as it is the script that is called by the PitchTracker.

We recommend working with eclipse for this project. Be sure to make the mysong folder the project folder, so Java knows where to find the pitchtrack.praat file.

A models folder will also be installed with your code. This includes the major transition model, minor transition model, and the observation model (i.e., the emission probability matrix). You need not touch these. But if you accidentally delete one or more of them, a copy of the folder lives in /course/cs141/lib/mysonglib/models.

The .wav files available as inputs to your program are located in /course/cs141/lib/mysong/wavfiles. The filenames are formatted like this: "filename.120.wav", where the number is the tempo (in beats per minute) of the melody. All provided files have four beats per measure.

Note: To hear the the input to and the output of your program, you will need to plug headphones into the back of your computer. In order to change the volume on the department machines, type gnome-volume-control into a shell. This will give you a volume control interface for the sound card. You can test your sound by typing play <wavfile> into a shell. If your sound is not working, a consultant can fix it for you. This is simply an ownership issue (if someone logged in on your machine before you, he or she owns the sound card, and you cannot use it).

8  Resources

  1. First MySong Paper:
    http://research.microsoft.com/en-us/um/people/dan/mysong/MySongCHI2008.pdf

  2. Second MySong Paper:
    http://research.microsoft.com/en-us/um/people/dan/mysong/MySongAAAI2008.pdf

  3. Mysong Homepage:
    http://research.microsoft.com/en-us/um/people/dan/mysong/

  4. Viterbi Algorithm on Wikipedia:
    http://en.wikipedia.org/wiki/Viterbi_algorithm

Last modified: Friday, May 1st, 2009 11:08:19am