Lately I've been looking for patterns in the words used in resources for web applications. We're looking at which words are most common in each application, what part of speech each one is, and what patterns show up in the parts of speech across a whole resource. One problem I keep running into is that many words are hard to classify as nouns, adjectives, or verbs without their context: for example, "grade" can be a noun or a verb, "abstract" can be an adjective or a noun, and so on. Usually the context of the application or the rest of the resource makes it pretty easy to figure out which part of speech a word actually is, but there are some we're still not sure about ("login" and "logout" are the most notable examples).

So the next thing I'll be doing is going into our automatically generated files and manually updating the parts of speech based on the specific contexts of the resources, keeping track of how many changes have to be made for each application (to determine how important it would be to automate that process). We also want to see whether this really gives us valuable insight into how people navigate through the applications and how we might be able to group similar sets of pages.
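To make the bookkeeping concrete, here is a minimal Python sketch of the three steps described above: counting the most common words in a resource, flagging words whose part of speech depends on context, and counting how many automatic tags a manual pass had to change. This is not our actual tooling. The `POSSIBLE_TAGS` lexicon, the function names, and the sample data are all hypothetical and only illustrate the idea.

```python
import re
from collections import Counter

# Hypothetical lexicon: every part of speech each word can take.
# A real one would come from a dictionary or a tagger, not a hand-written dict.
POSSIBLE_TAGS = {
    "grade": {"noun", "verb"},
    "abstract": {"adjective", "noun"},
    "login": {"noun", "verb"},
    "logout": {"noun", "verb"},
    "submit": {"verb"},
    "course": {"noun"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def most_common_words(text, n=3):
    """Return the n most frequent words with their counts."""
    return Counter(tokenize(text)).most_common(n)


def ambiguous_words(text):
    """Return the words that have more than one possible part of speech."""
    words = set(tokenize(text))
    return sorted(w for w in words if len(POSSIBLE_TAGS.get(w, ())) > 1)


def count_corrections(auto_tags, manual_tags):
    """Count words whose manually assigned tag differs from the automatic one."""
    return sum(1 for word, tag in manual_tags.items() if auto_tags.get(word) != tag)


# Made-up resource text and tags for one application.
resource = "Grade the abstract. Submit the grade. Login to view the course grade."
auto_tags = {"grade": "noun", "abstract": "adjective", "login": "noun"}
manual_tags = {"grade": "verb", "abstract": "adjective", "login": "verb"}

print(most_common_words(resource, 2))
print(ambiguous_words(resource))
print(count_corrections(auto_tags, manual_tags))
```

Running `count_corrections` once per application gives the per-application tally, which is the number we would use to judge whether automating the context step is worth the effort.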

