Updating Archeological Site Field Data Collection Methods for the Twenty-First Century

In March of 2015, the Office of Contract Archeology (OCA) at the University of New Mexico began field testing various digital approaches to recording archeological site information. One critical piece of information for any archaeologist working in New Mexico is the Laboratory of Anthropology Site Record, simply called the LA Form. The LA Form is designed to record a variety of data about sites, including environmental impacts to the archaeological resource, site location and dimensions, and a description of activities performed during visits to the site.

In the past, OCA has typically carried blank copies of the LA Form to the field and recorded information using traditional pen and paper methods. This approach, while tried and true, requires additional processing time in the office since digital copies of the form are required for database processing and long-term data preservation. Because the form consists of eight pages with hundreds of different fields, converting LA Forms from paper to digital format can be a tedious, time-consuming, and error-prone task.

Android tablet with protective case used by UNM OCA for data collection

To avoid this processing time, and to take the first steps towards a modern digital data collection workflow, OCA has purchased several 10” Android tablets along with protective cases and other equipment for the demands of fieldwork.

For digital form data collection software, the Android-based, open-source Open Data Kit (ODK) was identified as a flexible solution. ODK has been used extensively for remote field-based projects around the globe and, because of its large user community, it is well supported and highly customizable. One critical piece of data used throughout archaeology is geolocation, and ODK (and a spinoff app named GeoODK) can automatically record GPS location using the tablet’s internal GPS antenna. Alternatively, if improved accuracy is required (as can often be the case in archaeology), higher-end GPS systems, such as Trimble units, can be linked to ODK via Bluetooth as a mock location provider.
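
As a rough illustration of how recorded locations might be checked afterwards, the sketch below parses a geopoint value in the space-separated form ODK uses (latitude, longitude, altitude, accuracy) and flags points whose reported accuracy is worse than a chosen limit. The function names, the 5 m threshold, and the sample coordinates are our own assumptions for illustration; they are not part of ODK or of the OCA workflow.

```python
from typing import NamedTuple


class GeoPoint(NamedTuple):
    latitude: float   # decimal degrees
    longitude: float  # decimal degrees
    altitude: float   # metres
    accuracy: float   # metres, as reported by the location provider


def parse_geopoint(raw: str) -> GeoPoint:
    """Parse a geopoint string of the form 'latitude longitude altitude accuracy'."""
    parts = raw.split()
    if len(parts) != 4:
        raise ValueError(f"expected 4 values, got {len(parts)}: {raw!r}")
    lat, lon, alt, acc = (float(p) for p in parts)
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {raw!r}")
    return GeoPoint(lat, lon, alt, acc)


def needs_external_gps(point: GeoPoint, max_error_m: float = 5.0) -> bool:
    """Return True when the reported accuracy is worse than the assumed limit."""
    return point.accuracy > max_error_m


# Made-up example value, not field data.
sample = parse_geopoint("35.0844 -106.6504 1619.0 12.5")
print(sample, needs_external_gps(sample))
```

A check like this could help decide which site locations should be revisited with a higher-end receiver.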

Converting the LA Form (or any other conceivable form) to ODK requires knowledge of XML; alternatively, the form can be designed in Microsoft Excel and exported to XML format. The steps are detailed below:

  1. Convert the form to a fillable PDF form.
  2. Set field names and types (e.g. checkbox or text) within the PDF form.
  3. Within Excel, set up matching field names, types, labels, required fields, and other information (shown in the image below).
  4. Convert the Excel spreadsheet to XML using ODK’s XLSForm.
  5. Load the blank XML form on the tablet and collect data using ODK.

A portion of the LA Form in Microsoft Excel format
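
To make the spreadsheet step more concrete, here is a minimal sketch of what a few rows of an XLSForm definition could look like, written as Python data and dumped to CSV text. The column names (type, name, label, required, list_name) and the question types follow the XLSForm convention, but the field names and labels are hypothetical and do not come from the actual LA Form.

```python
import csv
import io

SURVEY_COLUMNS = ["type", "name", "label", "required"]
CHOICES_COLUMNS = ["list_name", "name", "label"]

# Hypothetical fields for illustration; the real LA Form has hundreds.
survey_rows = [
    {"type": "text", "name": "site_name", "label": "Site name", "required": "yes"},
    {"type": "date", "name": "visit_date", "label": "Date of visit", "required": "yes"},
    {"type": "geopoint", "name": "site_location", "label": "Site location", "required": "yes"},
    {"type": "decimal", "name": "site_length_m", "label": "Site length (m)", "required": ""},
    {"type": "select_one yes_no", "name": "erosion_present", "label": "Erosion present?", "required": ""},
]

choices_rows = [
    {"list_name": "yes_no", "name": "yes", "label": "Yes"},
    {"list_name": "yes_no", "name": "no", "label": "No"},
]


def check_survey(rows, choices):
    """Return a list of problems: duplicate field names or undefined choice lists."""
    problems = []
    seen = set()
    defined_lists = {choice["list_name"] for choice in choices}
    for row in rows:
        if row["name"] in seen:
            problems.append(f"duplicate field name: {row['name']}")
        seen.add(row["name"])
        kind, _, list_name = row["type"].partition(" ")
        if kind in ("select_one", "select_multiple") and list_name not in defined_lists:
            problems.append(f"undefined choice list: {list_name}")
    return problems


def to_csv(rows, columns):
    """Render rows as CSV text with the given column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(check_survey(survey_rows, choices_rows))
print(to_csv(survey_rows, SURVEY_COLUMNS))
```

The two tables correspond to the survey and choices worksheets of the Excel workbook; keeping the name column identical to the PDF field names is what lets the collected data be matched back to the PDF form later.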

Textual, numerical, geographic, and other data are then collected with ODK using the onscreen keyboard through a standard “swipe through” process. The completed form data are stored on the device for transfer in the office at a later date. On a recent five-day backcountry project, we found it prudent to manually back up these completed forms to an external SD card. You never know what can happen!

A portion of the LA Form as seen in ODK

Once we are back in the office, forms are pushed to an ODK Aggregate database server, which is backed up regularly. These data can also be pushed to the server from a remote location (e.g. a hotel room), providing the ability for OCA project directors to analyze the results of long-term fieldwork projects on a daily basis. ODK data containing grid coordinates, like those collected during excavation, can be displayed through a web-based map that can provide OCA archaeologists with regular updates on the progress of excavation projects.

Of course, because ODK uses XML to store finished forms, the LA Forms collected on the tablets can be difficult to read without converting them back to their original format. A custom Python script was developed to convert these LA Forms to PDF format for viewing, printing, and archiving. Using the open-source Python libraries untangle and fdfgen, the script converts the XML files to an Acrobat Forms Data Format (FDF) file. Once field names have been defined in a blank PDF version of the LA Form, the FDF file can be applied to the form using PDFtk. Finally, after a minute or two of Python magic, you are left with an editable PDF version of the ODK form collected in the field.

Python code using untangle and fdfgen to convert XML to FDF.
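
As a rough illustration of the first half of this conversion, the sketch below flattens a completed form instance into (field name, value) pairs. It is not the OCA script: it uses the standard library's xml.etree.ElementTree in place of untangle so that it runs without extra packages, and the sample XML, field names, and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified form instance; real submissions have many more fields.
SAMPLE_XML = """
<data id="la_form">
  <site_name>Example Site</site_name>
  <visit_date>2015-03-14</visit_date>
  <location_group>
    <site_location>35.0844 -106.6504 1619.0 4.0</site_location>
  </location_group>
  <erosion_present>yes</erosion_present>
</data>
"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}name' down to 'name'."""
    return tag.rsplit("}", 1)[-1]


def flatten_instance(xml_text):
    """Return (field_name, value) pairs for every leaf element below the root."""
    root = ET.fromstring(xml_text)
    fields = []
    for element in root.iter():
        # Leaf elements hold answers; elements with children are groups.
        if element is not root and len(element) == 0:
            fields.append((local_name(element.tag), (element.text or "").strip()))
    return fields


for name, value in flatten_instance(SAMPLE_XML):
    print(f"{name}: {value}")
```

A list of pairs in this shape should be what fdfgen's forge_fdf function takes as its field data, and the resulting FDF file can then be merged into the blank form with PDFtk's fill_form operation. Both of those steps are left out here so that the sketch stays runnable on its own, and they rely on the pair names matching the field names defined in the PDF.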

A sample of the PDF version of the final LA Form

OCA is currently refining this process and digitizing a variety of forms, including photo logs and excavation records. More information, including Python conversion scripts, can be found on our GitHub repository at github.com/UNMOCA/ODKArchForms.

Scott Gunn
GIS Analyst
Office of Contract Archeology
University of New Mexico
sagunn@unm.edu
http://oca.unm.edu
