ELDP Remote Fieldwork workshop series recap

Leading up to the deadline for the Endangered Languages Documentation Programme’s (ELDP) 2022 grant round, ELDP organised a series of workshops on remote fieldwork. The sessions took place between Monday 13 September 2021 and Friday 17 September 2021, and were based on a series of forthcoming papers in LD&C. All five workshops were recorded and are available on ELAR’s Vimeo channel.

Establishing new relationships in online language work – Karolina Grzech and Selena Tisalema Shaca

In the first workshop, Karolina Grzech and Selena Tisalema Shaca discussed “Establishing new relationships in online language work”. The workflow they described was developed when Karolina’s ELDP project “Endangered oral traditions in the Andes and the Amazon: collaborative documentation of two Quechuan languages of Ecuador” had to move from in situ to remote fieldwork due to the COVID-19 pandemic.

The session described and problematised a language documentation workflow based entirely on the online conferencing software Zoom. In this workflow, a linguist from outside the community establishes a new project together with a community member who is a native speaker. Karolina and Selena applied this workflow in their joint project, which started in 2020 and focused on Tungurahua Kichwa, a Quechuan language spoken in the Ecuadorian Highlands.

What is distinctive about this workflow is that the linguist and the native speaker did not know each other beforehand. Participants were walked through the steps the two of them took to launch a new and successful collaboration while communicating only online. The issues covered were:

• Prerequisites for setting up a new project online
• Getting to know each other: issue of trust in online work with strangers
• The project’s workflow and practicalities
• Online training
• Data sharing
• Combining online and in-situ fieldwork

The workshop’s aim was to account for every step of the project’s workflow in enough detail that it can be replicated. The session also offered a critical appraisal of this workflow from the perspective of both the native speaker and the researcher.

The workshop took place in English and Spanish and is available here:

YouTube as a transcription and translation tool – Alexander Rice

The second workshop was led by Alexander Rice and introduced participants to “YouTube as a transcription and translation tool”.

During the session, participants learnt how to use YouTube to transcribe and translate recordings and how to incorporate it into a remote language documentation workflow. YouTube gives documentation teams an easy way to do corpus work, allowing them to create accessible annotations with communities remotely. This can be useful when travel restrictions and other pandemic-related events prevent researchers and community members from working together in person. Using a short sample video, participants practised the following tasks:

• Transcribing and translating in YouTube and creating subtitle files
• Exporting subtitle files from YouTube and importing them to ELAN as annotations
• Exporting intonational unit annotations from ELAN and importing them to YouTube as subtitles

Beyond the practical exercises, the workshop discussed the contexts in which YouTube is appropriate for language documentation work, along with recommended practices for online data management and privacy.
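
As a rough illustration of the export-and-import step practised in the session, the sketch below parses subtitle text in the SubRip (.srt) format into time-aligned segments of the kind an annotation tier holds. It is not the workshop's own script: it assumes the subtitles were downloaded as .srt, and the sample cues and the function names `to_ms` and `parse_srt` are made up for this example.

```python
import re

# Matches an SRT timestamp such as 00:01:02,500 (hours:minutes:seconds,milliseconds).
SRT_TIME = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")

# Hypothetical two-cue sample, standing in for a downloaded subtitle file.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,500
first utterance

2
00:00:04,000 --> 00:00:06,250
second utterance
"""


def to_ms(stamp):
    """Convert an SRT timestamp to milliseconds."""
    match = SRT_TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_srt(text):
    """Return a list of (start_ms, end_ms, text) tuples, one per cue."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            if "-->" in line:
                start, end = line.split("-->")
                # Everything after the timing line is the cue text.
                body = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((to_ms(start), to_ms(end), body))
                break
    return cues


if __name__ == "__main__":
    for start, end, text in parse_srt(SAMPLE_SRT):
        print(start, end, text)
```

Each tuple carries a start time, an end time, and a text value, which is the information a time-aligned annotation needs; how the segments are then brought into ELAN depends on the import route chosen.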

The discussion during the workshop led to some improvements to the workflow, such as using ELAN’s “Label and number annotations” function to facilitate importing segmented empty annotations into YouTube.

The workshop took place in English and is available here:

Redesigning language documentation projects during the COVID-19 pandemic – Katherine Bolaños, Jakob Lesage, and Sheena Shah

In the third session, “Redesigning language documentation projects during the COVID-19 pandemic”, Katherine Bolaños, Jakob Lesage, and Sheena Shah shared insights about redesigning their ELDP-funded documentation projects to accommodate remote fieldwork. The settings for the three projects differed in various ways, but all relied on pre-existing links with the language communities. Some of the issues discussed were:

• Reasons for doing remote fieldwork
• Social considerations of remote fieldwork
• Issues relating to equipment, training, and technology
• Consequences for budget and planning

The recording of the session, which was conducted in English, is available here:

Supporting linguistic data collection from afar: a mobile metadata system – Richard Griscom

In the fourth workshop of the series, Richard Griscom presented on “Supporting linguistic data collection from afar: a mobile metadata system”.

The session described a method for remotely supporting and monitoring a language documentation project conducted by speakers, community activists, or academic researchers, using a free and open-source data collection platform called KoBoToolbox. The system is based on the creation of digital linguistic metadata with mobile devices linked to a secure central server. This gives project leaders the ability to access metadata as soon as it is submitted, quickly generate summary reports and visualizations, and export metadata for further processing and archiving. The system is suitable for anyone who would like to integrate mobile metadata into a new or ongoing project and is able to provide the necessary training either remotely or in person.
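
Once metadata has been exported, the kind of summary report mentioned here can be approximated in a few lines. The sketch below is only an illustration and assumes a CSV export; the column names (`session_id`, `speaker`, `genre`), the sample rows, and the function names are hypothetical and do not reflect KoBoToolbox's actual export schema.

```python
import csv
import io
from collections import Counter

# Hypothetical export: one row per submitted recording session.
SAMPLE_EXPORT = """session_id,speaker,genre
S001,Speaker A,narrative
S002,Speaker B,song
S003,Speaker A,narrative
S004,Speaker A,
"""


def count_by(csv_text, column):
    """Count the non-empty values of one metadata column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = ((row.get(column) or "").strip() for row in reader)
    return Counter(value for value in values if value)


def missing(csv_text, column, id_column="session_id"):
    """List the IDs of rows in which the given column is empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[id_column] for row in reader
            if not (row.get(column) or "").strip()]


if __name__ == "__main__":
    print(count_by(SAMPLE_EXPORT, "speaker"))
    print(missing(SAMPLE_EXPORT, "genre"))
```

A check like `missing` is one way a project leader working remotely might spot incomplete submissions early and follow up with the person who made the recording.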

In preparation for the workshop, participants created a free account at kobotoolbox.org and worked with the latest version of Lameta.

The materials used during the workshop are available via these links:
Dropbox Data Folder – https://www.dropbox.com/sh/5y1dslj9t6fx4rd/AAATkpFFgQ_d9_6sFsByFfUva?dl=0
LingMetaX – https://colab.research.google.com/drive/149OpY8zxxSHA1u2deInzkegnUsEj1jiI#scrollTo=KVdTQ9sOmpLn&uniqifier=5
Participant Metadata Cleaner – https://colab.research.google.com/drive/1czhlqs5LErytTaeqNeD3T5UxlY41RFqr
Participant Names Checker – https://colab.research.google.com/drive/1X0fo1vJH3NnD50OucZLwqrc7GogMxBA5
Session ID and Filename Checker – https://colab.research.google.com/drive/1nuKRGzfzC8glK9zPW1u0JM274rtmpUrQ
Speaker Networks Script (and others) – https://colab.research.google.com/drive/1CW0nPfMf4bO27mr7miPR8FVdm4QRDmMP#scrollTo=98aOFH-HUec0
ICLDC Presentation – https://youtu.be/FlgA6-ghP4A

The recording of the session, which was conducted in English, is available here:

Methods and data management tools for using WhatsApp for language work – Kelsey Neely

The final workshop, conducted by Kelsey Neely, discussed “Methods and data management tools for using WhatsApp for language work”.

The session demonstrated methods, tools and workflows that can be used for doing language work via WhatsApp. There are many ways that WhatsApp and similar messaging technologies can be applied to language work, but the talk specifically focused on the use of voice messages as the primary means of carrying out common linguistic research tasks like elicitation and text analysis. Voice messages are an excellent option for these types of tasks, especially in contexts where internet connections are too slow or intermittent for real-time audio or video calls and where there is not a standard orthography that speakers are confident in using.

The workshop covered the following topics:

• The pros and cons of this method and some possible workflows for different types of elicitation
• How to save and organize WhatsApp audio files and how to re-encode, rename, and concatenate them using Python and FFmpeg
• How to create a pre-segmented .eaf based on clip duration using Python and FFmpeg
• A possible workflow for text annotation (e.g., careful respeakings and translations)
• A sample script for turning .eaf annotations from ELAN into clips that can be used in WhatsApp
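
As a hedged sketch of the re-encoding, concatenation, and pre-segmentation steps listed above (not Kelsey's own scripts), the example below builds FFmpeg command lines and derives segment boundaries from clip durations. The file names, the 44.1 kHz mono WAV target, and all function names are assumptions made for illustration. The commands are only constructed here; to execute them they would be passed to `subprocess.run`.

```python
from itertools import accumulate


def reencode_cmd(src, dst, sample_rate=44100):
    """FFmpeg command that re-encodes one voice message to mono audio."""
    return ["ffmpeg", "-y", "-i", src, "-ar", str(sample_rate), "-ac", "1", dst]


def concat_list(audio_files):
    """Contents of the list file read by FFmpeg's concat demuxer.

    Assumes file names contain no single quotes.
    """
    return "".join(f"file '{name}'\n" for name in audio_files)


def concat_cmd(list_file, dst):
    """FFmpeg command that joins the listed files without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]


def segments(durations_ms):
    """Turn clip durations into (start_ms, end_ms) pairs on one timeline."""
    ends = list(accumulate(durations_ms))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


if __name__ == "__main__":
    # Hypothetical clip names and durations in milliseconds.
    clips = ["msg_001.wav", "msg_002.wav", "msg_003.wav"]
    print(" ".join(reencode_cmd("msg_001.opus", clips[0])))
    print(concat_list(clips), end="")
    print(" ".join(concat_cmd("clips.txt", "session.wav")))
    print(segments([1500, 2000, 500]))
```

The (start, end) pairs correspond to the time-aligned annotations a pre-segmented .eaf would contain, one per concatenated clip; writing the .eaf XML itself is left out of this sketch.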

To follow the workshop, participants were asked to install Python 3 and FFmpeg, as well as WhatsApp on both a laptop and a smartphone. The scripts used during the workshop were the following:

The recording of the workshop is available here:

Kelsey has made her ffmpeg workflow and commands available here:

