Monday, November 17, 2008

I got the HMM w/ DAL features working. Actually it uses GMMs (single-state HMMs) with a bigram language model. It gets about 30% accuracy with 9 emotional categories. I need to try
  • using different categories,
  • using more mixtures (see the sketch after this list), and
  • using higher-order n-grams.
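For the mixture experiments I'll probably just script HTK's HHEd from Perl. A rough sketch, with made-up file names (hmmdefs, emolist, the hmm_mix4 output directory) rather than my actual setup:

use strict;
use warnings;

# Write a one-line HHEd edit script that raises every model to 4 mixture components.
open my $hed, '>', 'mix4.hed' or die "can't write mix4.hed: $!";
print $hed "MU 4 {*.state[2].mix}\n";   # state 2 is the single emitting state
close $hed;

# Apply it to the current model set and put the result in a new directory.
mkdir 'hmm_mix4' unless -d 'hmm_mix4';
system('HHEd', '-H', 'hmmdefs', '-M', 'hmm_mix4', 'mix4.hed', 'emolist') == 0
    or die "HHEd failed: $?";

After that it should just be a couple more HERest passes to re-estimate.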

I switched to testing out latent Dirichlet allocation for now. To get that working, I need to
  • normalize the text,
  • convert the text vocabulary items to integers,
  • encode the text in each document (here, utterances) as the counts of each vocabulary item, represented as integers (see the sketch after this list), and
  • figure out the LDA software.
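A minimal sketch of the middle two steps, assuming the normalized utterances already sit in @utterances as whitespace-separated token strings (everything here is placeholder code, not the real pipeline):

use strict;
use warnings;

my @utterances = ('i am so angry right now', 'that is wonderful news');   # toy data

my %vocab;        # token -> integer id
my @doc_counts;   # one hashref per utterance: id -> count

for my $utt (@utterances) {
    my %counts;
    for my $tok (split /\s+/, $utt) {
        unless (exists $vocab{$tok}) {
            my $id = scalar keys %vocab;   # next unused 0-based id
            $vocab{$tok} = $id;
        }
        $counts{ $vocab{$tok} }++;
    }
    push @doc_counts, \%counts;
}

# Dump a sparse docId/termId/count listing just to eyeball it; the real output
# format will depend on whichever LDA package I end up using.
for my $d (0 .. $#doc_counts) {
    print "$d $_ $doc_counts[$d]{$_}\n" for sort { $a <=> $b } keys %{ $doc_counts[$d] };
}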
Last week I slightly improved my productivity, except one day I didn't do any research due to Ema experiments. I'm averaging ~6.5 hours of research time per day if I don't include the day that I didn't work. Including that day puts me at only about 5 hrs per day on average.

Now that I'm using my bike instead of my car, I'm going to have to adapt a little.

Wednesday, November 5, 2008

Thank you God for president-elect Barack Obama! Thank you humanity for voting for him!

Today I made some progress on the Dictionary of Affect HMM. I finally got the feature vector written in the HTK binary format (I had to change the reverse statement a little, so that it reverses the bytes within each four-byte float rather than the whole vector; that gives the big-endian order HTK expects on a little-endian machine):

$out .= reverse pack "f", $_ for map {@{$_}} @{$self->{currentDocument}->{observationSequence}};
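So, putting it together with the header code from Monday, the whole write now looks roughly like this. The observation values, dimensions, and file name below are made up, and the parmKind of 9 should be HTK's USER kind for raw user-defined features:

use strict;
use warnings;

# Hypothetical example: two observation vectors of three DAL features each.
my @obs = ([0.1, 0.2, 0.3], [0.4, 0.5, 0.6]);

my $nSamples   = scalar @obs;              # number of observation vectors
my $sampPeriod = 100_000;                  # sample period in 100 ns units (arbitrary for text)
my $sampSize   = 4 * scalar @{ $obs[0] };  # bytes per vector (4-byte floats)
my $parmKind   = 9;                        # USER parameter kind

# 12-byte header in big-endian order, as HTK expects.
my $out = pack "NNnn", $nSamples, $sampPeriod, $sampSize, $parmKind;

# Data: byte-swap each packed float individually (host assumed little-endian).
$out .= reverse pack "f", $_ for map { @{$_} } @obs;

# Dump it to disk through a binmode'd handle so nothing mangles the bytes.
open my $fh, '>', 'utt0001.usr' or die "can't write utt0001.usr: $!";
binmode $fh;
print {$fh} $out;
close $fh;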

I almost got the corpus partitioner working (partition into train and test sets), so that's what I'll work on tomorrow.
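The partitioning itself should be simple enough. A rough sketch of what I have in mind, with a made-up 80/20 split and placeholder utterance IDs:

use strict;
use warnings;
use List::Util qw(shuffle);

my @utterances = map { sprintf 'utt%04d', $_ } 1 .. 100;   # placeholder IDs

# Shuffle once, cut at 80% for training, keep the rest for testing.
my @shuffled = shuffle @utterances;
my $cut      = int(0.8 * @shuffled);
my @train    = @shuffled[0 .. $cut - 1];
my @test     = @shuffled[$cut .. $#shuffled];

printf "%d train, %d test\n", scalar @train, scalar @test;

For the emotion data it might make more sense to split by speaker or session rather than by utterance, but that's the basic idea.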

Monday, November 3, 2008

My productivity decreased over the last week or so. Part of it was Rebeka's visit... not her fault, but I didn't work that weekend. The next week she left on Tuesday, then there was the election, so I've been reading the news a bit too much.

Last week I did some reading on graphical models (LDA in particular) and I watched this lecture: http://videolectures.net/mlss07_ghahramani_grafm/ . I started programming a fuzzy logic tool/library in Perl. I made some progress and got an object-oriented framework going, but I kind of let it slip in the past couple of days. I also started looking at HMMs for the emotion text classification (using DAL features). At first I started programming HMMs in Perl, but then I decided to go back to HTK and use that. The whole thing with continuous-density HMMs seems like a lot to implement if it's not necessary. Trying to implement them was interesting, b/c I didn't study the details the last time I read the HTK book.
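To give a flavor of the type-1 stuff, the core of it is just membership functions plus the usual operators. A toy sketch in the OO style I've been using, where the class name and interface are purely illustrative:

package FuzzySet::Triangular;
use strict;
use warnings;

# A type-1 fuzzy set with a triangular membership function over [left, peak, right].
sub new {
    my ($class, %args) = @_;
    my $self = { left => $args{left}, peak => $args{peak}, right => $args{right} };
    return bless $self, $class;
}

# Degree of membership of a crisp value $x, in [0, 1].
sub membership {
    my ($self, $x) = @_;
    return 0 if $x <= $self->{left} || $x >= $self->{right};
    return ($x - $self->{left}) / ($self->{peak} - $self->{left}) if $x <= $self->{peak};
    return ($self->{right} - $x) / ($self->{right} - $self->{peak});
}

package main;

my $warm = FuzzySet::Triangular->new(left => 15, peak => 22, right => 30);
printf "membership(20) = %.2f\n", $warm->membership(20);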

I wasted some time backing up the tball data on the snap server, but now that I've figured out how to use it, it will be convenient having it mounted under Linux. I also wasted some time getting my earphones working (USB w/ Linux and ALSA).

This coming week will be busy. I'm hoping to volunteer at the Obama campaign b/c I'm getting nervous. Besides that, I also have a lot on my plate to finish:
  1. HMM study: get HTK to estimate the HMM params for the IEMOCAP text w/ DAL features (rough sketch below).
  2. latent Dirichlet allocation: find a toolkit and apply it to the IEMOCAP text w/ pure text features.
  3. program the fuzzy logic toolkit in Perl. Ideally I can get a type-1 version ready by the end of the week.
  4. program a GUI for emotional interaction.
It's a lot to do, but I need to get back on track for the dissertation proposal.
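For item 1, the HTK side will probably be a flat start with HCompV followed by a few passes of HERest, driven from Perl. A sketch with invented file names (proto, trainlist.scp, labs.mlf, emolist), not my actual setup:

use strict;
use warnings;

mkdir $_ for grep { !-d $_ } 'hmm0', 'hmm1', 'hmm2', 'hmm3';

# Flat start: compute the global mean/variance and copy them into the prototype model.
system('HCompV', '-m', '-S', 'trainlist.scp', '-M', 'hmm0', 'proto') == 0
    or die "HCompV failed: $?";

# (In between, the flat-started proto still has to be cloned into hmm0/hmmdefs,
#  one copy per emotion label in emolist.)

# A few passes of embedded re-estimation over the labelled training data.
for my $i (1 .. 3) {
    my ($in, $out) = ('hmm' . ($i - 1), "hmm$i");
    system('HERest', '-S', 'trainlist.scp', '-I', 'labs.mlf',
           '-H', "$in/hmmdefs", '-M', $out, 'emolist') == 0
        or die "HERest pass $i failed: $?";
}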

I wasted a bunch of time trying to figure out how to write the HTK parameter files from Perl. After a lot of playing around with pack/unpack, I finally figured it out:

$out = pack "NNnn", $nSamples, $sampPeriod, $sampSize, $parmKind;

Where N/n are the network-order (big-endian) long/short pack formats. For writing the data vector:

$out .= reverse pack "f*", map {@{$_}} @{$self->{currentDocument}->{observationSequence}};
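A quick way to sanity-check a written file is to read the 12-byte header back and unpack it the same way (the file name here is just a placeholder):

use strict;
use warnings;

open my $fh, '<', 'utt0001.usr' or die "can't read utt0001.usr: $!";
binmode $fh;
read($fh, my $header, 12) == 12 or die "short read";
my ($nSamples, $sampPeriod, $sampSize, $parmKind) = unpack "NNnn", $header;
print "$nSamples samples, $sampSize bytes each, parmKind $parmKind\n";
close $fh;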