Monday, 17 November 2008

I got the HMM w/ DAL features working. Actually, it uses GMMs (single-state HMMs) with a bigram language model. It gets about 30% accuracy with 9 emotional categories. I need to try
  • using different categories,
  • using more mixtures, and
  • also seeing if I can use higher order n-grams.

I switched to testing out latent Dirichlet allocation for now. To get that working, I need to
  • normalize the text,
  • convert the text vocabulary items to integers,
  • encode the text in each document (here, utterances) as the counts of each vocabulary item (represented as integers), and
  • figure out the LDA software (a rough sketch of the first three items is below).
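
As that rough sketch of the preprocessing: the variable names and the one-utterance-per-line input are just placeholders, and the "N id:count ..." output is what I believe Blei's lda-c expects, so I'll double check against whatever toolkit I end up using.

use strict;
use warnings;

# One document (utterance) per line on STDIN.
my %vocab;   # word => integer id (0-based); should also be written out so ids can be mapped back to words
my @docs;    # one hash ref per document: { word id => count }

while (my $line = <STDIN>) {
    chomp $line;
    $line = lc $line;
    $line =~ s/[^a-z' ]+/ /g;   # crude normalization: lowercase, drop punctuation and digits
    my %counts;
    for my $tok (split ' ', $line) {
        unless (exists $vocab{$tok}) {
            my $next_id = scalar keys %vocab;
            $vocab{$tok} = $next_id;
        }
        $counts{ $vocab{$tok} }++;
    }
    push @docs, \%counts;
}

# One line per document: the number of distinct terms, then id:count pairs.
for my $doc (@docs) {
    my @pairs = map { "$_:$doc->{$_}" } sort { $a <=> $b } keys %{$doc};
    print scalar(@pairs), " ", join(" ", @pairs), "\n";
}
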
Last week I slightly improved my productivity, except one day I didn't do any research due to Ema experiments. I'm averaging ~6.5 hours of research time if I don't include the day I didn't work. Including that day puts me at only about 5 hrs per day on average.

Now that I'm using my bike instead of my car, I'm going to have to adapt a little.

Wednesday, 5 November 2008

Thank you God for president-elect Barack Obama! Thank you humanity for voting for him!

Today I made some progress on the dictionary of affect HMM. I finally got the feature vector written in the HTK binary format (I had to change the reverse statement a little, so that it reverses only the bytes of each float and not the whole vector):

$out .= reverse pack "f", $_ for map {@{$_}} @{$self->{currentDocument}->{observationSequence}};
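
(A possible simplification, assuming Perl 5.10 or newer: I think pack's ">" byte-order modifier also works on the float formats there, so the big-endian floats could be written directly without the per-element reverse:

$out .= pack "f>*", map {@{$_}} @{$self->{currentDocument}->{observationSequence}};

I haven't verified that on the Perl I'm running, so the reverse version stays for now.)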

I almost got the corpus partitioner working (partition into train and test sets), so that's what I'll work on tomorrow.
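
The core of the partitioning should be pretty small. Something like the following is roughly what I'm aiming for (a sketch with a hypothetical list of utterance IDs on the command line and a hard-coded 90/10 split):

use strict;
use warnings;
use List::Util qw(shuffle);

# Randomly split a list of utterance IDs into train and test sets (90/10).
my @ids    = @ARGV;              # or read them from a file list
my @mixed  = shuffle @ids;
my $n_test = int(0.1 * @mixed);
my @test   = @mixed[0 .. $n_test - 1];
my @train  = @mixed[$n_test .. $#mixed];
print "train: @train\ntest: @test\n";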

Monday, 3 November 2008

My productivity decreased the last week or so. Part of it was Rebeka's visit... not her fault but I didn't work that weekend. The next week she left on Tuesday, then there was the election, so I've been reading the news a bit too much.

Last week I did some reading on graphical models (LDA in particular) and I watched this lecture: http://videolectures.net/mlss07_ghahramani_grafm/ . I started programming a fuzzy logic tool/library in Perl. I made some progress and got an object-oriented framework going, but I kind of let it slip in the past couple of days. I also started looking at HMMs for the emotion text classification (using DAL features). At first I started programming HMMs in Perl, but then I decided to go back to HTK and use that. The whole business of continuous-density HMMs seems like a lot to implement if it's not necessary. Trying to implement them was interesting b/c I hadn't studied the details the last time I read the HTK book.
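
For the record, the basic building block of the fuzzy library is meant to be a fuzzy set object with a membership function. A minimal sketch (the package name and interface are placeholders, not necessarily what I'll keep):

package FuzzySet::Triangular;
use strict;
use warnings;

# Triangular membership function with feet at $left and $right and a peak at $peak.
sub new {
    my ($class, $left, $peak, $right) = @_;
    return bless { left => $left, peak => $peak, right => $right }, $class;
}

# Degree of membership of $x in this set, in [0, 1].
sub membership {
    my ($self, $x) = @_;
    my ($l, $p, $r) = @{$self}{qw(left peak right)};
    return 0 if $x <= $l or $x >= $r;
    return ($x - $l) / ($p - $l) if $x < $p;
    return ($r - $x) / ($r - $p);
}

1;

With that in place, the type-1 AND/OR operations are just min/max over memberships, so the rule evaluation layer can stay simple.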

I wasted some time backing up the tball data on the snap server, but now that I've figured out how to use it, it will be convenient having it mounted under Linux. I also wasted some time getting my earphones working (USB w/ Linux and ALSA).

This coming week will be busy. I'm hoping to volunteer at the Obama campaign b/c I'm getting nervous. Besides that I also have a lot on my plate to finish:
  1. HMM study: get HTK to estimate the HMM params for the IEMOCAP text w/ DAL features
  2. latent Dirichlet allocation: find a toolkit and apply it to the IEMOCAP text w/ pure text features
  3. program a fuzzy logic toolkit in Perl. Ideally I can get a type-1 version ready by the end of the week.
  4. program a GUI for emotional interaction.
It's a lot to do but I need to get back on track for the dissertation proposal.

I wasted a bunch of time trying to figure out how to write the HTK parameter files from Perl. After a lot of playing around with pack/unpack, I finally figured it out:

$out = pack "NNnn", $nSamples, $sampPeriod, $sampSize, $parmKind;

where N and n are the network-order (big-endian) long and short pack formats. For writing the data vector,

$out .= reverse pack "f*", map {@{$_}} @{$self->{currentDocument}->{observationSequence}};
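
Putting the header and the data together, the whole writer comes out to roughly the following. This is just a sketch: the sub name and arguments are made up, USER is parameter kind 9 if I'm reading the HTK book right, and the byte swapping assumes a little-endian machine. Note that the reverse is applied per float, so only the byte order changes and not the order of the samples.

sub write_htk_user_file {
    my ($path, $obs, $samp_period) = @_;        # $obs: array ref of array refs of floats
    my $n_samples = scalar @{$obs};
    my $samp_size = 4 * scalar @{$obs->[0]};    # bytes per sample (4-byte floats)
    my $parm_kind = 9;                          # USER parameter kind

    my $out = pack "NNnn", $n_samples, $samp_period, $samp_size, $parm_kind;
    $out .= reverse pack "f", $_ for map {@{$_}} @{$obs};   # per-float swap to big-endian

    open my $fh, '>', $path or die "can't open $path: $!";
    binmode $fh;
    print {$fh} $out;
    close $fh;
    return;
}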

Tuesday, 21 October 2008

Mainly I was working on the IEMOCAP text analysis. I made various plots to visualize the DAL features (min, max, and avg for activation, valence, and possibly imagery). I realized that the OOV approach I was taking was not good for several reasons. For one, the average (neutral?) value for activation was lower than the midpoint, so using the midpoint as a default actually skews the data. Secondly, in the future I'd like to predict unknown values from text. Having it use a default should be done explicitly, and separately from assigning the values for known vocabulary.
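
If I do move the default off the scale midpoint, the simplest data-driven alternative is probably the mean over the words that are in the dictionary. A sketch, assuming a %dal hash of word => [activation, valence, imagery] (and, if I remember right, the DAL scales run roughly 1-3, so 2 would be the midpoint fallback):

# Mean activation over in-vocabulary tokens, used as an explicit OOV default.
my ($sum, $n) = (0, 0);
for my $tok (@tokens) {
    next unless exists $dal{$tok};
    $sum += $dal{$tok}[0];        # index 0 = activation in this (hypothetical) layout
    $n++;
}
my $oov_activation = $n ? $sum / $n : 2.0;   # fall back to the midpoint only if nothing matched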

In the morning, I read a little bit from the algebra book about subgroups. There was a part toward the end of the section that I didn't understand well, so I'll have to go back and look at it again.

Monday, 20 October 2008

I spent a lovely 3 hours at the LA Metropolitan Courthouse. I got a ticket for registration last spring, registered my car, but then I forgot that I had to show up in court. So for that, I got to pay $200 and waste 3 hours. There was some collections agency calling me trying to get $800+ but they seemed sketchy, so I opted to get another court date instead of paying. They didn't tell me it would be $600 less, so I guess that is worth 3 hours of my time.

Besides that, I mainly worked on starting a draft of a user interface paper (~4 hrs), preparing a list of sessions and papers from Interspeech that would be of interest to the emotion group (1.5 hrs), and working on IEMOCAP text analysis (~4 hrs).

For the interface paper, what I still need to do is incorporate some discussion of fuzzy logic, and then actually create the interface that I'm proposing.

For the IEMOCAP text work, I was looking at how much of the corpus is covered by the dictionary of affect. It looks good but there is more preprocessing I can do to bring the coverage up.
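
The coverage check itself is tiny, roughly the following (again assuming a %dal lookup hash and a flat @tokens list for the corpus), so most of the work is really in the preprocessing that feeds it:

# Fraction of corpus tokens that have a DAL entry.
my ($covered, $total) = (0, 0);
for my $tok (@tokens) {
    $total++;
    $covered++ if exists $dal{lc $tok};
}
printf "DAL coverage: %.1f%% (%d of %d tokens)\n", 100 * $covered / $total, $covered, $total;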

TODO: create interface, discuss fuzzy logic in the interface paper, improve preprocessing for IEMOCAP (maybe make it a subclass of a more general routine)
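
On that last point, the shape I have in mind is roughly a generic preprocessor class with a corpus-specific subclass; all the names and the bracket-stripping rule below are hypothetical:

package TextPreprocessor;
use strict;
use warnings;

sub new { my ($class, %args) = @_; return bless { %args }, $class; }

# Generic normalization shared by any corpus.
sub normalize {
    my ($self, $text) = @_;
    $text = lc $text;
    $text =~ s/[^a-z' ]+/ /g;
    $text =~ s/\s+/ /g;
    return $text;
}

package TextPreprocessor::IEMOCAP;
use strict;
use warnings;
our @ISA = ('TextPreprocessor');

# Corpus-specific cleanup first, then the generic pass.
sub normalize {
    my ($self, $text) = @_;
    $text =~ s/\[[^\]]*\]//g;    # e.g. strip bracketed annotations, if the transcripts have them
    return $self->SUPER::normalize($text);
}

1;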

Saturday, 18 October 2008

Today started off taking it easy. Panchi and I went out for some coffee and talked about work, research, and some logic/algebra/number theory. Then I was reading the news and about multipoint/multitouch interfaces. I finished reading Jakob Nielsen's paper about next-generation user interfaces. The dimensions that Nielsen predicts will characterize next-generation interfaces (or non-command interfaces, as he calls them) are:

User focus
Computer's role
Interface control
Syntax
Object visibility
Interaction Stream
Bandwidth
Tracking feedback
Turn-taking
Interface locus
User programming
Software packaging

Friday, 17 October 2008

Today most of my work was in trying to program Qt/CLAM and reading Xavier's thesis. I didn't get too far with Qt/CLAM b/c I got stuck compiling and installing new Qt libraries. It was a little bit of a pain b/c some of the packages were masked in Gentoo. Xavier's thesis is nice to read; it gives me a good feel for the flow and organization of a dissertation, and it's also good for reading about the differences between systems, models, frameworks, patterns, etc.

The main things that I've been programming with Qt are simple dialogs from various examples. What I need to do is get sound recording to work and then hook up a recognizer.

I spent a total of ~5 hours working. My main loss of time was getting up in the morning, and I stopped working early b/c I had a headache.

For fun stuff, I went out to eat w/ Markus, Panchi, and Karla at Sofra and then we went to bamboo for Mojitos and Caipirinhas.