Monday, 17 November 2008

I got the HMM w/ DAL features working. Actually, it uses GMMs (single-state HMMs) with a bigram language model. It gets about 30% accuracy with 9 emotional categories. I need to try
  • using different categories,
  • using more mixtures, and
  • also seeing if I can use higher order n-grams.

I switched to testing out latent Dirichlet allocation for now. To get that working, I need to
  • normalize the text,
  • convert the text vocabulary items to integers,
  • encode the text in each document (here, utterances) as the counts of each vocabulary item (represented as integers), and
  • figure out the LDA software (a rough sketch of the first three items is below).
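
As that rough sketch of the preprocessing: the variable names and the one-utterance-per-line input are just placeholders, and the "N id:count ..." output is what I believe Blei's lda-c expects, so I'll double check against whatever toolkit I end up using.

use strict;
use warnings;

# One document (utterance) per line on STDIN.
my %vocab;   # word => integer id (0-based); should also be written out so ids can be mapped back to words
my @docs;    # one hash ref per document: { word id => count }

while (my $line = <STDIN>) {
    chomp $line;
    $line = lc $line;
    $line =~ s/[^a-z' ]+/ /g;   # crude normalization: lowercase, drop punctuation and digits
    my %counts;
    for my $tok (split ' ', $line) {
        unless (exists $vocab{$tok}) {
            my $next_id = scalar keys %vocab;
            $vocab{$tok} = $next_id;
        }
        $counts{ $vocab{$tok} }++;
    }
    push @docs, \%counts;
}

# One line per document: the number of distinct terms, then id:count pairs.
for my $doc (@docs) {
    my @pairs = map { "$_:$doc->{$_}" } sort { $a <=> $b } keys %{$doc};
    print scalar(@pairs), " ", join(" ", @pairs), "\n";
}
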
Last week I slightly improved my productivity, except one day I didn't do any research due to Ema experiments. I'm averaging ~6.5 hours of research time if I don't include the day I didn't work. Including that day puts me at only about 5 hrs per day on average.

Now that I'm using my bike instead of my car, I'm going to have to adapt a little.

Wednesday, 5 November 2008

Thank you God for president-elect Barack Obama! Thank you humanity for voting for him!

Today I made some progress on the dictionary of affect HMM. I finally got the feature vector written in the HTK binary format (I had to change the reverse statement a little, so that it reverses only the bytes of each float and not the whole vector):

$out .= reverse pack "f", $_ for map {@{$_}} @{$self->{currentDocument}->{observationSequence}};
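
(A possible simplification, assuming Perl 5.10 or newer: I think pack's ">" byte-order modifier also works on the float formats there, so the big-endian floats could be written directly without the per-element reverse:

$out .= pack "f>*", map {@{$_}} @{$self->{currentDocument}->{observationSequence}};

I haven't verified that on the Perl I'm running, so the reverse version stays for now.)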

I almost got the corpus partitioner working (partition into train and test sets), so that's what I'll work on tomorrow.
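
The core of the partitioning should be pretty small. Something like the following is roughly what I'm aiming for (a sketch with a hypothetical list of utterance IDs on the command line and a hard-coded 90/10 split):

use strict;
use warnings;
use List::Util qw(shuffle);

# Randomly split a list of utterance IDs into train and test sets (90/10).
my @ids    = @ARGV;              # or read them from a file list
my @mixed  = shuffle @ids;
my $n_test = int(0.1 * @mixed);
my @test   = @mixed[0 .. $n_test - 1];
my @train  = @mixed[$n_test .. $#mixed];
print "train: @train\ntest: @test\n";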

Monday, 3 November 2008

My productivity decreased the last week or so. Part of it was Rebeka's visit... not her fault but I didn't work that weekend. The next week she left on Tuesday, then there was the election, so I've been reading the news a bit too much.

Last week I did some reading on graphical models (LDA in particular) and I watched this lecture: http://videolectures.net/mlss07_ghahramani_grafm/ . I started programming a fuzzy logic tool/library in Perl. I made some progress and got an object-oriented framework going, but I kind of let it slip in the past couple of days. I also started looking at HMMs for the emotion text classification (using DAL features). At first I started programming HMMs in Perl, but then I decided to go back to HTK and use that. The whole business of continuous-density HMMs seems like a lot to implement if it's not necessary. Trying to implement them was interesting b/c I hadn't studied the details the last time I read the HTK book.
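
For the record, the basic building block of the fuzzy library is meant to be a fuzzy set object with a membership function. A minimal sketch (the package name and interface are placeholders, not necessarily what I'll keep):

package FuzzySet::Triangular;
use strict;
use warnings;

# Triangular membership function with feet at $left and $right and a peak at $peak.
sub new {
    my ($class, $left, $peak, $right) = @_;
    return bless { left => $left, peak => $peak, right => $right }, $class;
}

# Degree of membership of $x in this set, in [0, 1].
sub membership {
    my ($self, $x) = @_;
    my ($l, $p, $r) = @{$self}{qw(left peak right)};
    return 0 if $x <= $l or $x >= $r;
    return ($x - $l) / ($p - $l) if $x < $p;
    return ($r - $x) / ($r - $p);
}

1;

With that in place, the type-1 AND/OR operations are just min/max over memberships, so the rule evaluation layer can stay simple.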

I wasted some time backing up the tball data on the snap server, but now that I've figured out how to use it, it will be convenient having it mounted under Linux. I also wasted some time getting my earphones working (USB w/ Linux and ALSA).

This coming week will be busy. I'm hoping to volunteer at the Obama campaign b/c I'm getting nervous. Besides that I also have a lot on my plate to finish:
  1. HMM study: get HTK to estimate the HMM params for the IEMOCAP text w/ DAL features
  2. latent Dirichlet allocation: find a toolkit and apply it to the IEMOCAP text w/ pure text features
  3. program a fuzzy logic toolkit in Perl. Ideally I can get a type-1 version ready by the end of the week.
  4. program a GUI for emotional interaction.
It's a lot to do but I need to get back on track for the dissertation proposal.

I wasted a bunch of time trying to figure out how to write the HTK parameter files from Perl. After a lot of playing around with pack/unpack, I finally figured it out:

$out = pack "NNnn", $nSamples, $sampPeriod, $sampSize, $parmKind;

where N and n are the network-order (big-endian) long and short pack formats. For writing the data vector,

$out .= reverse pack "f*", map {@{$_}} @{$self->{currentDocument}->{observationSequence}};
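
Putting the header and the data together, the whole writer comes out to roughly the following. This is just a sketch: the sub name and arguments are made up, USER is parameter kind 9 if I'm reading the HTK book right, and the byte swapping assumes a little-endian machine. Note that the reverse is applied per float, so only the byte order changes and not the order of the samples.

sub write_htk_user_file {
    my ($path, $obs, $samp_period) = @_;        # $obs: array ref of array refs of floats
    my $n_samples = scalar @{$obs};
    my $samp_size = 4 * scalar @{$obs->[0]};    # bytes per sample (4-byte floats)
    my $parm_kind = 9;                          # USER parameter kind

    my $out = pack "NNnn", $n_samples, $samp_period, $samp_size, $parm_kind;
    $out .= reverse pack "f", $_ for map {@{$_}} @{$obs};   # per-float swap to big-endian

    open my $fh, '>', $path or die "can't open $path: $!";
    binmode $fh;
    print {$fh} $out;
    close $fh;
    return;
}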

Tuesday, 21 October 2008

Mainly I was working on the IEMOCAP text analysis. I made various plots to visualize the DAL features (min, max, and avg for activation, valence, and possibly imagery). I realized that the OOV approach I was taking was not good for several reasons. For one, the average (neutral?) value for activation was lower than the midpoint, so using the midpoint as a default actually skews the data. Secondly, in the future I'd like to predict unknown values from text. Having it use a default should be done explicitly, and separately from assigning the values for known vocabulary.
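
If I do move the default off the scale midpoint, the simplest data-driven alternative is probably the mean over the words that are in the dictionary. A sketch, assuming a %dal hash of word => [activation, valence, imagery] (and, if I remember right, the DAL scales run roughly 1-3, so 2 would be the midpoint fallback):

# Mean activation over in-vocabulary tokens, used as an explicit OOV default.
my ($sum, $n) = (0, 0);
for my $tok (@tokens) {
    next unless exists $dal{$tok};
    $sum += $dal{$tok}[0];        # index 0 = activation in this (hypothetical) layout
    $n++;
}
my $oov_activation = $n ? $sum / $n : 2.0;   # fall back to the midpoint only if nothing matched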

In the morning, I read a little bit from the algebra book about subgroups. There was a part toward the end of the section that I didn't understand well, so I'll have to go back and look at it again.

Monday, 20 October 2008

I spent a lovely 3 hours at the LA Metropolitan Courthouse. I got a ticket for registration last spring, registered my car, but then I forgot that I had to show up in court. So for that, I got to pay $200 and waste 3 hours. There was some collections agency calling me trying to get $800+ but they seemed sketchy, so I opted to get another court date instead of paying. They didn't tell me it would be $600 less, so I guess that is worth 3 hours of my time.

Besides that, I mainly worked on starting a draft of a user interface paper (~4 hrs), preparing a list of sessions and papers from Interspeech that would be of interest to the emotion group (1.5 hrs), and working on IEMOCAP text analysis (~4 hrs).

For the interface paper, what I still need to do is incorporate some discussion of fuzzy logic, and then actually create the interface that I'm proposing.

For the IEMOCAP text work, I was looking at how much of the corpus is covered by the dictionary of affect. It looks good but there is more preprocessing I can do to bring the coverage up.
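
The coverage check itself is tiny, roughly the following (again assuming a %dal lookup hash and a flat @tokens list for the corpus), so most of the work is really in the preprocessing that feeds it:

# Fraction of corpus tokens that have a DAL entry.
my ($covered, $total) = (0, 0);
for my $tok (@tokens) {
    $total++;
    $covered++ if exists $dal{lc $tok};
}
printf "DAL coverage: %.1f%% (%d of %d tokens)\n", 100 * $covered / $total, $covered, $total;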

TODO: create interface, discuss fuzzy logic in the interface paper, improve preprocessing for IEMOCAP (maybe make it a subclass of a more general routine)
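
On that last point, the shape I have in mind is roughly a generic preprocessor class with a corpus-specific subclass; all the names and the bracket-stripping rule below are hypothetical:

package TextPreprocessor;
use strict;
use warnings;

sub new { my ($class, %args) = @_; return bless { %args }, $class; }

# Generic normalization shared by any corpus.
sub normalize {
    my ($self, $text) = @_;
    $text = lc $text;
    $text =~ s/[^a-z' ]+/ /g;
    $text =~ s/\s+/ /g;
    return $text;
}

package TextPreprocessor::IEMOCAP;
use strict;
use warnings;
our @ISA = ('TextPreprocessor');

# Corpus-specific cleanup first, then the generic pass.
sub normalize {
    my ($self, $text) = @_;
    $text =~ s/\[[^\]]*\]//g;    # e.g. strip bracketed annotations, if the transcripts have them
    return $self->SUPER::normalize($text);
}

1;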

Saturday, 18 October 2008

Today started off taking it easy. Panchi and I went out for some coffee and talked about work, research, and some logic/algebra/number theory. Then I was reading the news and about multipoint/multitouch interfaces. I finished reading Jakob Nielsen's paper about next-generation user interfaces. The dimensions that Nielsen predicts will characterize next-generation interfaces (or non-command interfaces, as he calls them) are:

User focus
Computer's role
Interface control
Syntax
Object visibility
Interaction Stream
Bandwidth
Tracking feedback
Turn-taking
Interface locus
User programming
Software packaging

Friday, 17 October 2008

Today most of my work was in trying to program Qt/CLAM and reading Xavier's thesis. I didn't get too far with Qt/CLAM b/c I got stuck compiling and installing new Qt libraries. It was a little bit of a pain b/c some of the packages were masked in Gentoo. Xavier's thesis is nice to read; it gives me a good feel for the flow and organization of a dissertation, and it's also good for reading about the differences between systems, models, frameworks, patterns, etc.

The main things that I've been programming with Qt are simple dialogs from various examples. What I need to do is get sound recording to work and then hook up a recognizer.

I spent a total of ~5 hours working. My main loss of time was getting up in the morning, and I stopped working early b/c I had a headache.

For fun stuff, I went out to eat w/ Markus, Panchi, and Karla at Sofra and then we went to bamboo for Mojitos and Caipirinhas.