Index Generation

Feature Description

The indices will be alphabetized lists of the words featured in the graffiti. Each entry will show how many times the word appears in the graffiti in the database and will link to a page with more information about the graffiti that feature the word. That word page will either link to those graffiti or display their information and photos directly. Additionally, we hope to implement a jump-to feature that lets the user jump directly to a letter in the index rather than scrolling all the way to the word they want to see. The word index will need to disregard non-word tokens, such as Roman numerals and other stray, non-semantic lettering that may be present in the data. Our goal is for the indices to contain lemmata rather than individual entries for each conjugated form of a verb or declined form of a noun. We also plan to implement similar indices for the person and place names attested in the graffiti.
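As a rough illustration of the indexing described here, the following Python sketch counts lemmata, skips Roman numerals, and groups entries by initial letter for a jump-to feature. Everything in it is an assumption made for illustration: the sample data, the tiny LEMMATA table, and the function names are hypothetical, and the project's real data model and lemmatizer may look quite different.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample data: graffito ID -> transcribed text.
GRAFFITI = {
    "g1": "amat puellam",
    "g2": "amat vinum XII",
    "g3": "puella amat",
}

# Hypothetical lemma table; a real one would come from a Latin lemmatizer.
LEMMATA = {"amat": "amo", "puellam": "puella"}

# Matches well-formed Roman numerals, case-insensitively.
ROMAN = re.compile(
    r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})",
    re.IGNORECASE,
)

def is_word(token):
    """Keep alphabetic tokens that are not Roman numerals.

    Caveat: this naive filter also drops real words that happen to look
    like numerals (for example "vi"), so it would need refinement.
    """
    return token.isalpha() and not ROMAN.fullmatch(token)

def build_index(graffiti):
    """Return (counts per lemma, graffito IDs per lemma)."""
    counts = Counter()
    occurrences = defaultdict(set)
    for graffito_id, text in graffiti.items():
        # Split the text into runs of letters, ignoring digits and punctuation.
        for token in re.findall(r"[^\W\d_]+", text.lower()):
            if not is_word(token):
                continue
            lemma = LEMMATA.get(token, token)  # fall back to the raw form
            counts[lemma] += 1
            occurrences[lemma].add(graffito_id)
    return counts, occurrences

def group_by_letter(counts):
    """Group alphabetized (lemma, count) pairs under their initial letter."""
    groups = defaultdict(list)
    for lemma in sorted(counts):
        groups[lemma[0].upper()].append((lemma, counts[lemma]))
    return dict(groups)

counts, occurrences = build_index(GRAFFITI)
letters = group_by_letter(counts)
```

With the sample data, "amat" is counted three times under the lemma "amo", "XII" is skipped, and the letter groups come out under A, P, and V. Keeping the set of graffito IDs per lemma is what would let a word page link back to the individual graffiti.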

Prerequisites

  • Understanding how the data for each graffito is stored so that we can discuss how to process and display that information
  • Determining which data structures will best realize our goals
  • Developing a system that not only indexes current data but also automatically processes new data when it is uploaded
  • Understanding the search APIs
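One possible shape for a system that processes new data as it is uploaded is an index that can add or replace a single graffito without rebuilding everything. The sketch below is only an assumption about how that could work: the WordIndex class and its methods are hypothetical, it keeps everything in memory, and it expects tokens that have already been filtered and lemmatized.

```python
from collections import Counter, defaultdict

class WordIndex:
    """Hypothetical in-memory index that absorbs one graffito at a time."""

    def __init__(self):
        self.counts = Counter()               # word -> total appearances
        self.occurrences = defaultdict(set)   # word -> graffito IDs
        self._tokens_by_graffito = {}         # graffito ID -> its tokens

    def add_graffito(self, graffito_id, tokens):
        """Index a new graffito, or replace an existing one on re-upload."""
        tokens = list(tokens)
        self.remove_graffito(graffito_id)
        self._tokens_by_graffito[graffito_id] = tokens
        for token in tokens:
            self.counts[token] += 1
            self.occurrences[token].add(graffito_id)

    def remove_graffito(self, graffito_id):
        """Undo a graffito's contribution; a no-op if it was never indexed."""
        for token in self._tokens_by_graffito.pop(graffito_id, []):
            self.counts[token] -= 1
            self.occurrences[token].discard(graffito_id)
            if self.counts[token] == 0:
                # Drop words that no longer appear anywhere.
                del self.counts[token]
                del self.occurrences[token]

index = WordIndex()
index.add_graffito("g1", ["amo", "puella"])
index.add_graffito("g2", ["amo"])
index.add_graffito("g1", ["vinum"])  # corrected re-upload of g1
```

Remembering each graffito's tokens is the design choice that makes replacement cheap: an edited graffito only touches its own words rather than forcing a full re-index.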

What the user sees

The user should be able to access a page containing the indices. This page should consist of a list organized alphabetically by value. For each word, the index page will show how often the word appears and a link to a page dedicated to that word. The word page will contain a list of each instance of the word and where it occurs (using the findspot), summary statistics, and possibly additional data visualizations. Each entry in the list will link to the summary page for the individual graffito. The summary page for a word will be broken up into pages of 10 entries each to help the user more easily navigate words that occur often.
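The 10-entries-per-page behavior could be sketched as follows. The paginate function and its parameters are hypothetical names chosen for this example, and the entries are stand-in values rather than real graffito records.

```python
import math

PAGE_SIZE = 10  # entries per page, as proposed above

def paginate(entries, page, page_size=PAGE_SIZE):
    """Return (entries on the 1-based page, total number of pages)."""
    total_pages = max(1, math.ceil(len(entries) / page_size))
    if not 1 <= page <= total_pages:
        raise ValueError(f"page must be between 1 and {total_pages}")
    start = (page - 1) * page_size
    return entries[start:start + page_size], total_pages

# Stand-in for 23 occurrences of one word.
occurrences = [f"graffito-{n}" for n in range(1, 24)]
last_page, total_pages = paginate(occurrences, 3)
```

Here a word with 23 occurrences spans three pages, and the third page holds the final three entries. Returning the total page count alongside the slice gives the page template what it needs to render previous and next links.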

User behavior

The user should be able to reach our pages in two ways: through an “Indices” link in the title bar, right next to the “About the Project” link, and through the search results. The summary page for a particular value should behave similarly to a search result but should be more robust in its presentation of the data. We should also consider how easily the user can find information; one option is to keep the information present but collapsed in search results and expanded on the summary page.

Use Cases

The indices would be very useful to anyone researching a particular term. They would provide easy, standardized access to data that could be used for further research or data visualizations. It is worth asking the client what types of research she envisions this tool being used for so that we can think about what types of information would be helpful to display on a word's summary page.

Relative Priority of Feature

This feature is a higher priority as it adds significant functionality to the website and requires a fair amount of structural work. Statistics and visualizations on the word pages are additional features that we hope to implement but are a lower priority than the index pages and the word pages to which they link.

courses/cs335/spring2019/graffiti/index.1556674160.txt.gz · Last modified: 2019/05/01 01:29 by holmesr
CC Attribution-Noncommercial-Share Alike 4.0 International