Text Analysis Classifier

Predicts whether a piece of text sounds like it was written by a man or a woman, whether its tone is positive or negative, and the author's approximate age. Try it out!


This works best when you type between 50 and 100 words:

Word count: 0
Predictions and Certainties:
Gender:
f:50%
m:50%

Tone:
-:50%
+:50%

Age:


How does it work? What does it do?

I built this classifier using a deep neural network with word embeddings. The word embeddings for a piece of text are combined into a weighted average, and the neural network has 2 layers. It has been quite a journey to get this classifier to work, and I have learned a lot from it. It is trained on the Blog Authorship Corpus.

The main issue I ran into was that my computer is not powerful enough to run through the whole dataset, so I had to break the data down into a small sample. From that sample I took only the first 100 words of blog entries that were over 100 words long, and I skipped blog entries that didn't have at least 50 words. I also made sure the training dataset was approximately 50% male and 50% female. This reduced the amount of training data to a point where I could train an effective model.

In the future I am going to use Microsoft Azure to train a Deep Averaging Neural Network on the whole Blog Authorship Corpus. After that I plan to expand this project into a Chrome extension that people can use to check their email and other text on the internet.
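To make the data preparation and the averaging step more concrete, here is a minimal sketch in Python. It is not the code behind this page. The function names, the `embeddings` lookup table, and the `weights` table are all hypothetical, and since the weighting scheme is not described above, the sketch simply assumes one weight per word with a default of 1.0. The 2-layer network that consumes the averaged vector is left out.

```python
import random
from collections import defaultdict

MIN_WORDS = 50   # entries shorter than this are skipped
MAX_WORDS = 100  # longer entries are cut to their first 100 words


def prepare_entry(text):
    """Return the entry's first MAX_WORDS words, or None if it is too short."""
    words = text.split()
    if len(words) < MIN_WORDS:
        return None
    return words[:MAX_WORDS]


def balance_by_label(examples, seed=0):
    """Downsample so every label has as many examples as the rarest one.

    `examples` is a list of (words, label) pairs, e.g. label "m" or "f".
    """
    by_label = defaultdict(list)
    for words, label in examples:
        by_label[label].append((words, label))
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


def weighted_average(words, embeddings, weights):
    """Weighted average of the embeddings of the known words in `words`.

    Words missing from `embeddings` are ignored. A word missing from
    `weights` gets weight 1.0 (an assumption made for this sketch).
    Returns None if no word is known.
    """
    total = None
    weight_sum = 0.0
    for word in words:
        vector = embeddings.get(word)
        if vector is None:
            continue
        w = weights.get(word, 1.0)
        if total is None:
            total = [0.0] * len(vector)
        for i, value in enumerate(vector):
            total[i] += w * value
        weight_sum += w
    if total is None or weight_sum == 0.0:
        return None
    return [value / weight_sum for value in total]
```

Whatever the length of the input, `weighted_average` returns a single vector with the same dimension as the embeddings, so the network always sees a fixed-size input. Downsampling to the rarest label is only one way to get a roughly 50/50 split. It is used here because it needs no extra data.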