TECHNOLOGY AND INFRASTRUCTURE

We use open standards and resilient infrastructure to improve the sustainability of our work over the long term.

Standards

Wherever possible, our work uses open, non-proprietary data standards and file formats. This helps improve the sustainability of our work over the longer term and protects what we produce from technological change. We also favour making our data available for re-use by third parties via Web APIs and data download links.
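As an illustration of what publishing data in open formats can involve, the following minimal Python sketch serialises the same set of records as both CSV and JSON, two non-proprietary formats suited to download links and Web API responses. The function name export_records, the field names and the sample records are hypothetical and are not taken from any of our projects.

```python
import csv
import io
import json


def export_records(records):
    """Serialise a list of flat dicts as CSV text and as JSON text."""
    # Collect column names in first-seen order so the CSV header is stable.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record become empty cells

    json_text = json.dumps(records, ensure_ascii=False, indent=2)
    return buffer.getvalue(), json_text


# Hypothetical sample records, for illustration only.
sample = [
    {"id": "1", "title": "First item"},
    {"id": "2", "title": "Second item", "year": "1850"},
]
csv_text, json_text = export_records(sample)
```

Both outputs are plain text, so they can be read without specialist software and converted to other formats later if required.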

Architecture

Our systems are typically built in PHP, and sometimes in Python, running on Apache and pulling data from XML files, MySQL databases and Lucene indexes as appropriate.
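The step of pulling data from an XML file might look something like the sketch below. It is written in Python rather than PHP purely for brevity, and it uses only the standard library. The element names (records, record, title, year), the id attribute and the function name load_records are assumptions made for the example, not a description of any actual project schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML source, for illustration only.
SAMPLE_XML = """<records>
  <record id="r1"><title>First item</title><year>1850</year></record>
  <record id="r2"><title>Second item</title></record>
</records>"""


def load_records(xml_text):
    """Parse XML text and return one dict per <record> element."""
    root = ET.fromstring(xml_text)
    records = []
    for element in root.iter("record"):
        records.append({
            "id": element.get("id"),
            # findtext returns the default when the child element is absent.
            "title": element.findtext("title", default=""),
            "year": element.findtext("year", default=""),
        })
    return records


records = load_records(SAMPLE_XML)
```

Once the records are held as plain dictionaries, the same structure can be passed to a page template or returned from an API, regardless of whether the data originally came from XML, a database or a search index.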

Our website front-ends are built using React as well as PHP (Symfony framework), Java, Bootstrap, XHTML and CSS, incorporating Ajax-style features where appropriate. For graphical modelling, data visualisation and mapping we use JavaScript tools such as D3.js, Node.js and Leaflet. For 3D modelling (e.g. http://www.markmybird.org) we use WebGL.

For large-scale computational tasks that require distributed computing power, we use our University’s own Hadoop cluster with Apache Hive, in addition to Amazon Web Services (AWS).

Where appropriate, our mobile applications are developed as mobile web applications (typically when the app simply needs to pull down and display data), because this approach is sustainable and platform-agnostic. More sophisticated apps are developed in C or Java and then compiled for specific mobile operating systems.

Our research resources are often created by providing project teams with access to a Content Management System (CMS), specifically designed around the needs of their project, so that data can be created and managed centrally, irrespective of the researcher’s location, and previewed live using the evolving end-user interface. The CMS also enables researchers to continue creating and managing their data long after the funded period of their project has ceased. We currently build these using PHP Sonata Admin.

Infrastructure

Our servers are dedicated virtual machines (VMs) running Ubuntu, hosted and maintained by the University’s computing services. Our server family consists of development servers for internal development and testing of online resources, production servers for hosting publicly accessible resources over the longer term, and database servers for hosting the databases that underpin the resources on the production servers. We currently run two development servers, six production servers, and two database servers. All data assets, such as primary source datasets, are stored on our dedicated storage area network (SAN), which functions as a data repository (for long-term management of data) and a data server (serving data to the database and production servers). All our servers are housed in dedicated data centres.

We use a dedicated source code repository on GitHub, which enables the development team to store, track, version-control and deploy code changes. We use penetration testing software such as sqlmap (http://sqlmap.org) and OWASP ZAP (https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project) to test our code for vulnerabilities, and we follow OWASP security recommendations and best practice. We have our own Service Management System for managing information about the profile, status, and hosting of development projects. The online project management system Basecamp (or, more frequently, Google Team Drive) is used for managing and sharing the non-technical aspects of a project (e.g. project documentation, milestones, and task lists).