SARV: a database for geoscience collections

Scientific collections can be used effectively in research only when they are appropriately catalogued and the information is readily accessible. Electronic databases greatly facilitate this. In Estonia, the first efforts to use electronic databases for collection management were made as early as 1994, and a few years later development began on a custom database designed specifically for geological collections and related information. This database, now known as SARV, has since evolved from an institutional desktop application into a relational client-server information system deployed in several institutions in Estonia. SARV aims to serve the needs of collection managers as well as researchers seeking information or wishing to deposit and publish their data in a structured and easily searchable form.

The data model of SARV consists of more than a hundred related database tables. The server-side software is based on various open-source components, such as the MySQL database server, PHP, Python, Django and OpenLayers. As of 2015, all data-entry and reporting interfaces as well as public data portals are web-based, and all functionality can be accessed through a browser.
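To make the idea of "related database tables" concrete, the following is a minimal sketch of how collection records might be linked in a relational model. It is not the actual SARV schema: the table names (`locality`, `taxon`, `specimen`), the columns and the sample rows are all hypothetical, and SQLite from the Python standard library stands in for the MySQL server purely so that the example is self-contained.

```python
import sqlite3

# Hypothetical, heavily simplified schema: three related tables standing in
# for the 100+ tables of a real collections data model. All names are
# illustrative assumptions, not taken from SARV.
SCHEMA = """
CREATE TABLE locality (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE taxon (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE specimen (
    id          INTEGER PRIMARY KEY,
    number      TEXT NOT NULL UNIQUE,
    locality_id INTEGER REFERENCES locality(id),
    taxon_id    INTEGER REFERENCES taxon(id)
);
"""


def build_demo_db():
    """Create an in-memory database with one made-up specimen record."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the table relations
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO locality (id, name) VALUES (1, 'Example quarry')")
    conn.execute("INSERT INTO taxon (id, name) VALUES (1, 'Example taxon')")
    conn.execute(
        "INSERT INTO specimen (number, locality_id, taxon_id) "
        "VALUES ('DEMO 1-1', 1, 1)"
    )
    conn.commit()
    return conn


def specimens_by_taxon(conn, taxon_name):
    """Return (specimen number, locality name) pairs for a given taxon name."""
    return conn.execute(
        """
        SELECT s.number, l.name
        FROM specimen AS s
        JOIN taxon AS t ON t.id = s.taxon_id
        JOIN locality AS l ON l.id = s.locality_id
        WHERE t.name = ?
        ORDER BY s.number
        """,
        (taxon_name,),
    ).fetchall()
```

Calling `specimens_by_taxon(build_demo_db(), "Example taxon")` joins the three tables to answer a question such as "which specimens of this taxon are held, and where were they collected?". A production model with over a hundred tables would presumably apply the same principle to many more entities.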

SARV has a publicly accessible web interface where users can search for information related to individual collection objects, fossil species, stratigraphical terms, image files, etc. The data can be freely used for non-commercial purposes under the terms of the Creative Commons license. SARV can also be accessed via the BioCASe, GBIF, GeoCASe and OpenUp/Europeana networks.

As of 2015, nearly half of Estonian geological collections are electronically catalogued at the unit level. The majority of the important collections, type and cited fossil specimens in particular, are already in the database. In addition to the registration of physical collection objects, the system contains a growing amount of related information, such as digitised photo archives, scanned field notebooks, results of geochemical analyses, and annotated taxonomic and stratigraphical dictionaries. Most of that information is freely accessible online.

Chronological highlights of database development

  • 1994 - First databasing attempts of geological collections at TUG
  • 1996 - Start of digital cataloguing of geological specimens, using Lotus 1-2-3 and MS Excel at GIT
  • 1998 - Initial version of multi-table database based on MS Access 97
  • 2000 - MS Access-based multi-user networked database
  • 2002 - Deployment of MySQL database server software in MS Windows environment
    - First public website for accessing the collections database, including a web map server
  • 2003 - Collaboration with Estonian Museum of Natural History (ELM), testing the same software and data model there
    - Joining the BioCASe specimen-level data network as the first institution in Estonia
  • 2004 - First dedicated server hardware, migration to Red Hat Linux operating system
    - The name "SARV" was first used for the database
    - More functional public web portal for data access
  • 2005 - Adjusting the database structure and building a web interface for ELM
    - Migration to Debian Linux
  • 2006 - First attempts to deploy SARV in the Museum of Geology, University of Tartu
  • 2007 - Linking with Google Maps web map service, smart-card authentication for restricted web-based interface
  • 2008 - Common web portal for three institutional databases, updates of other web interfaces
    - Specific modules for taxonomy and fossil species and photo archives
  • 2010 - Joining the international GeoCASe data network
  • 2011 - Upgrade of server hardware, prototype of web-based thin client to replace legacy MS Access application
  • 2012 - First professional software developer hired under NATARC
  • 2013 - Partial virtualisation of server hardware using Linux/KVM; special portal for analytical data
  • 2014 - Public database API
    - Joining DataCite consortium and first DOI identifiers assigned for datasets deposited in SARV
  • 2015 - Legacy desktop application replaced by web-based interfaces; data from multiple institutions; full virtualisation of server hardware using VMware