AWS will fund a $1 million project to convert six decades of the Jane Goodall Institute’s chimpanzee and baboon research—handwritten notes, film and observational logs—into a searchable, multimodal digital archive using Amazon Bedrock, SageMaker and partner-led UX design. The work aims to preserve irreplaceable data and make it accessible to researchers worldwide.
For research efforts that span generations, paper and film are liabilities: they deteriorate, stay siloed, and remain hard to query. AWS’s commitment to digitize JGI’s archives addresses both preservation and discovery. By converting handwritten notes and historic footage into structured, searchable datasets, scientists gain the ability to ask new, cross-decade research questions—for example, linking behavioural observations with satellite-derived habitat change or seasonal soundscape patterns.
How the project will work: cloud, multimodal AI and UX
AWS will partner with design and research specialist Ode to build the user experience while using Amazon Bedrock and Amazon SageMaker to process multimodal inputs. The pipeline has three visible stages:
Digitization: High-resolution scanning and secure cloud ingestion of handwritten notes, films and audio.
Multimodal indexing: OCR and handwriting recognition for field notes; frame-by-frame analysis for film; audio transcription and embedding-based search across text, image, and video.
Accessible portal: A researcher-facing portal enabling natural-language queries, cross-referencing, GIS overlays and exportable datasets.
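The embedding-based search in the indexing stage can be pictured with a small sketch. This is an illustration only, not the project's actual pipeline: the item identifiers and the three-dimensional vectors below are made up, and in a real system the vectors would come from a multimodal embedding model rather than being typed in by hand. The idea is that notes, film and audio are all mapped into one vector space, so a single query can rank items of any type by similarity.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class ArchiveItem:
    item_id: str            # hypothetical identifier, not a real JGI record
    modality: str           # "text", "video" or "audio"
    embedding: List[float]  # toy vector; a real one would come from a model


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query: List[float], items: List[ArchiveItem], top_k: int = 2) -> List[ArchiveItem]:
    """Return the top_k items most similar to the query, regardless of modality."""
    ranked = sorted(items, key=lambda it: cosine(query, it.embedding), reverse=True)
    return ranked[:top_k]


# Made-up items standing in for a field note, a film clip and a recording.
ITEMS = [
    ArchiveItem("note-001", "text", [0.9, 0.1, 0.0]),
    ArchiveItem("film-014", "video", [0.8, 0.2, 0.1]),
    ArchiveItem("audio-203", "audio", [0.0, 0.1, 0.9]),
]

results = search([1.0, 0.0, 0.0], ITEMS)
print([(r.item_id, r.modality) for r in results])
```

Because every item sits in the same space, the query here returns a text note and a film clip together, which is what cross-referencing across media would look like at a very small scale.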
Taimur Rashid of AWS framed the work as a blend of embedding models, prompt engineering and multimodal LLMs that will “unlock new possibilities through AI-powered analysis.”
JGI researchers expect immediate, pragmatic benefits: faster literature review across decades; the ability to correlate individual animal behaviour with environmental events; and improved public-facing education assets. The project builds on an AWS proof-of-concept and intends to surface examples where long-buried observational patterns might inform conservation priorities today.
Dr Lilian Pintea of JGI highlighted that digitisation will “amplify JGI’s mission and create a digital legacy that ensures Dr. Goodall’s pioneering work continues to inspire and guide future generations.”
Risks, governance and scientific integrity
Digitisation at scale raises important questions that the project will need to manage:
Data provenance: Handwritten notes and film must retain contextual metadata (dates, observers, camera settings) so automated analysis doesn’t strip essential context.
Model confidence and error rates: Handwriting recognition and multimodal tagging must include quality flags and manual review workflows for high-stakes research use.
Access and equity: A public portal should balance open research with the Institute’s ethical responsibilities toward communities and sensitive site data.
Sustainability: Long-term maintenance, refresh cycles, and funding beyond the initial $1 million commitment are crucial to prevent a new form of archival rot, this time digital.
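The provenance and confidence points above could look something like the following in practice. This is a hedged sketch under stated assumptions: the record fields, the 0.85 threshold and the sample values are all hypothetical and not taken from JGI or AWS. It shows a transcribed note that keeps its contextual metadata alongside the recognised text, and a simple rule that sends low-confidence transcriptions to a human reviewer.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical cut-off; a real project would tune this against reviewed samples.
REVIEW_THRESHOLD = 0.85


@dataclass
class TranscribedNote:
    note_id: str       # hypothetical identifier
    text: str          # output of handwriting recognition
    observer: str      # who recorded the observation
    observed_on: str   # ISO date of the original observation
    source_scan: str   # pointer back to the scanned page
    confidence: float  # recognition confidence between 0 and 1


def needs_manual_review(note: TranscribedNote, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Flag a note when recognition confidence falls below the threshold."""
    return note.confidence < threshold


def triage(notes: List[TranscribedNote]) -> Tuple[List[TranscribedNote], List[TranscribedNote]]:
    """Split notes into (accepted, queued for review) without dropping any metadata."""
    accepted = [n for n in notes if not needs_manual_review(n)]
    review = [n for n in notes if needs_manual_review(n)]
    return accepted, review


# Made-up sample records for illustration only.
SAMPLE = [
    TranscribedNote("n1", "placeholder text A", "observer-A", "1970-01-01", "scan-0001", 0.97),
    TranscribedNote("n2", "placeholder text B", "observer-B", "1970-01-02", "scan-0002", 0.61),
]

accepted, review = triage(SAMPLE)
print([n.note_id for n in accepted], [n.note_id for n in review])
```

Keeping the scan reference and observer on every record means a researcher can always trace a machine-read sentence back to the page it came from.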
Short-term indicators will include the percentage of the archive successfully digitized, retrieval accuracy for natural-language queries, and researcher adoption metrics. Long-term success will be measured by new cross-disciplinary studies enabled by the archive, policy or conservation outcomes informed by the data, and scalable community access to the platform.
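Two of the short-term indicators lend themselves to simple arithmetic. The sketch below is one possible way to compute them, not a method described by AWS or JGI: the function names, the choice of a "hit within the top k results" measure for retrieval accuracy, and all numbers are assumptions made for illustration.

```python
from typing import Dict, List, Set


def percent_digitized(items_done: int, items_total: int) -> float:
    """Share of the archive digitized so far, as a percentage."""
    if items_total <= 0:
        raise ValueError("items_total must be positive")
    return 100.0 * items_done / items_total


def hit_rate_at_k(results: Dict[str, List[str]], relevant: Dict[str, Set[str]], k: int = 5) -> float:
    """Fraction of test queries with at least one relevant item in the top k results."""
    if not relevant:
        raise ValueError("at least one judged query is required")
    hits = 0
    for query, relevant_ids in relevant.items():
        top = results.get(query, [])[:k]
        if any(item_id in relevant_ids for item_id in top):
            hits += 1
    return hits / len(relevant)


# Made-up figures: 250 of 1,000 items scanned, two judged test queries.
progress = percent_digitized(250, 1000)
accuracy = hit_rate_at_k(
    results={"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]},
    relevant={"q1": {"b"}, "q2": {"w"}},
    k=3,
)
print(progress, accuracy)
```

A hit-rate measure like this needs a set of queries with human-judged answers, which ties the retrieval metric back to the manual review workflows mentioned under risks.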
This project is an example of how enterprise AI tooling—cloud compute, multimodal models and scalable data engineering—can preserve cultural and scientific capital. It also illustrates an emerging role for corporate generative AI funds in supporting public-interest data infrastructure. If implemented with rigorous provenance, governance and community input, the platform could become a model for similar archival recoveries in ecology, anthropology and historical science.