Voluntary Associations
This is the main page for organizing the Fall 2011/Winter 2012 CSCI403 project: Automatically Gathering Inscriptions from Multiple Online Sources
- David Margolies, Winter 2012 Lead Developer
- Riley Jordan, Fall 2011 Lead Developer
Motivation
Many databases of inscriptions exist, but they are not centralized or easily searchable.
Goals
- Automatically collect data from multiple online sources
- Create a DB for the inscriptions
- Create an interface that makes the data easy to edit and export
Specifications
Collegium Project Proposal - a vision for the project
Tentative Schedule
Week 1: Jan 9
- Familiarize yourself with project
- Read the project proposal (under Specifications)
- Mostly for the motivation and the problem statement; it gives only a loose description of the project
- Try a search in the Heidelberg DB:
- Click Search under Epigraphic Text Database
- Click Simple-search
- Enter “collegi” under “String 1” in part D
- You'll get all inscriptions that contain “collegi”
Week 2: Jan 17
- Learn Python: first 4-5 weeks of CSCI 111 (I/O, basic assignments, arithmetic, using OO, for loops, if statements, conditionals, string operations)
Week 3: Jan 23
- Setting the locale for extracting data
- DB: Postgres 8.2 or 8.4 (does the version matter?)
- Creating a PostgreSQL DB
- Connecting to DB using Python
- Encoding
- Learn Python: next 4-5 weeks of CSCI 111
- Lists
- Files
- Defining Functions, modules
- while loops
- dictionaries
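A rough sketch of the "Connecting to DB using Python" and "Encoding" items above. It assumes the third-party psycopg2 driver is installed; the database name, user, and host are hypothetical placeholders, and the fallback from UTF-8 to Latin-1 is only a guess at what the source pages might use.

```python
import unicodedata


def to_text(raw, encodings=("utf-8", "latin-1")):
    """Decode bytes into a Unicode string, trying each encoding in turn."""
    for enc in encodings:
        try:
            # NFC stores an accented letter as one code point where possible,
            # so the same word always compares equal after decoding.
            return unicodedata.normalize("NFC", raw.decode(enc))
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode input with any of %r" % (encodings,))


def connect(dbname="inscriptions", user="collegium", host="localhost"):
    """Open a PostgreSQL connection that exchanges UTF-8 text.

    The default argument values are hypothetical; replace them with the
    real settings for the project database.
    """
    # Imported here so to_text() can be used without the driver installed.
    import psycopg2

    conn = psycopg2.connect(dbname=dbname, user=user, host=host)
    conn.set_client_encoding("UTF8")
    return conn
```

Decoding everything to Unicode before it reaches the database means the DB only ever sees one encoding, whatever the source site sent.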
Week 4: Jan 30
- Learn Python: Defining classes
Week 5: Feb 6
- Extracting data from Heidelberg
- Possible to simplify script?
- Get data in English?
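One possible shape for the extraction step, using only the standard library. It assumes the search results arrive as an HTML table; the real layout of the Heidelberg result pages is not described here, so the sample markup and its rows are made up for illustration and are not real records.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows found in an HTML string."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample, not taken from any real result page.
SAMPLE = """
<table>
  <tr><td>ID-0001</td><td>collegium fabrum</td></tr>
  <tr><td>ID-0002</td><td>collegio dendrophorum</td></tr>
</table>
"""

rows = extract_rows(SAMPLE)
```

Here `rows` holds one list per table row, ready to be passed on to whatever code stores the data.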
Week 6: Feb 13
- Storing data
- Create DB
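A minimal sketch of the storage step. The table layout and column names are assumptions, not a decided schema. To keep the example runnable without a server it uses the standard-library sqlite3 module as a stand-in for PostgreSQL; with psycopg2 the placeholders would be `%s` instead of `?`, and the `id` column would typically be declared `SERIAL`.

```python
import sqlite3

# Hypothetical schema: one row per inscription, keyed by where it came from.
SCHEMA = """
CREATE TABLE IF NOT EXISTS inscription (
    id INTEGER PRIMARY KEY,
    source TEXT NOT NULL,         -- which online DB the record came from
    source_id TEXT NOT NULL,      -- that DB's own identifier for the record
    transcription TEXT NOT NULL,  -- the inscription text
    UNIQUE (source, source_id)
)
"""


def create_db(conn):
    """Create the inscription table if it does not exist yet."""
    conn.execute(SCHEMA)
    conn.commit()


def store(conn, source, source_id, transcription):
    """Insert one inscription; return False if it was already stored."""
    try:
        conn.execute(
            "INSERT INTO inscription (source, source_id, transcription) "
            "VALUES (?, ?, ?)",
            (source, source_id, transcription),
        )
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a duplicate (source, source_id) pair.
        conn.rollback()
        return False
    conn.commit()
    return True
```

The `UNIQUE (source, source_id)` constraint lets the collection script be re-run without duplicating records that were already gathered.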
Week 7: Feb 27
- Learn new web framework
Week 8: Mar 5
- Create prototype of web interface
- Allow users to edit data? Annotate data?
- Extract data?
- Allow users to mine data?
- Maintain application code in a version control system
Week 9: Mar 12
- Refine interface
- Use Ajax, JQuery to
Week 10: Mar 19
- Deploy application; test
Week 11: Mar 26
- Modify based on feedback
- More testing
Week 12: Apr 2
- Finalize implementation
Final
- Submit code
- Documentation for how to run, maintain code
Resources
Corpora
Examples
Notes
- Search in Heidelberg DB for “collegi”
- Search in Clauss DB for “societ”
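The search strings above are truncated stems, so a plain "contains" search picks up several inflected forms at once. A small illustration, with made-up sample strings and an assumed case-insensitive comparison (the databases' own matching rules may differ):

```python
def matches_stem(text, stem):
    """Return True if the stem occurs anywhere in the text, ignoring case."""
    return stem.lower() in text.lower()


# Made-up sample strings, not real database records.
SAMPLES = [
    "collegium fabrum",
    "Collegio dendrophorum",
    "societatis",
    "ordo decurionum",
]

hits = [s for s in SAMPLES if matches_stem(s, "collegi")]
```

Here `hits` contains the first two samples, since both "collegium" and "Collegio" contain the stem "collegi".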
Extension Ideas
- Can we automatically classify any of the inscriptions by trade, deity, or type of association?
- Can we automatically generate any of the drop-down lists?