Infrastructure

From OpenCellID wiki
Revision as of 16:22, 14 January 2014 by Msemm (Talk | contribs)

Servers

[Figure: OpenCellID server structure (OpenCellID server strukture.PNG)]
Server                               | Software                 | Operating system | Resources
prod-ocid-web-01.colt.enaikoon.de    | Apache + Tomcat + MongoS | Ubuntu 12.04 LTS | 2 vCPU, 4 GB
prod-ocid-web-02.colt.enaikoon.de    | Apache + Tomcat + MongoS | Ubuntu 12.04 LTS | 2 vCPU, 4 GB
prod-ocid-cfgsrv-01.colt.enaikoon.de | MongoDB ConfigServer     | Ubuntu 12.04 LTS | 1 vCPU, 2 GB
prod-ocid-web-02.colt.enaikoon.de    | MongoDB ConfigServer     | Ubuntu 12.04 LTS | 1 vCPU, 2 GB
prod-ocid-web-03.colt.enaikoon.de    | MongoDB ConfigServer     | Ubuntu 12.04 LTS | 1 vCPU, 2 GB
prod-ocid-db-01.colt.enaikoon.de     | MongoDB Replication Set  | Ubuntu 12.04 LTS | 4 vCPU, 48 GB
prod-ocid-db-02.colt.enaikoon.de     | MongoDB Replication Set  | Ubuntu 12.04 LTS | 4 vCPU, 48 GB
prod-ocid-db-03.colt.enaikoon.de     | MongoDB Replication Set  | Ubuntu 12.04 LTS | 4 vCPU, 48 GB

Software stack

Operating System

All OpenCellID servers run Ubuntu Linux 12.04 LTS.

Frontend

  • The web frontend uses the Apache web server as a proxy that passes web requests on to Tomcat.
  • The OpenCellID web application runs on Tomcat and reads and writes cell measurement data in the MongoDB database backend.
  • jQuery Mobile provides the cross-platform user interface.
  • The map is displayed using OpenStreetMap combined with the Leaflet library.

Database Backend

  • The database backend, which currently holds 4.4 million cell towers and about 565 million measurements (as of 1 January 2014), is a MongoDB database cluster of six servers:
    • Three servers act as MongoDB configuration servers
    • The other three servers form the data store, with one replica set spread across all three
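The layout above can be written down as a small sketch. It only builds the two strings an operator would typically need: the seed string that registers a replica set as a shard, and the URI a client uses to reach a MongoS router. The host names come from the server table; the replica set name `ocid-rs0`, the database name `opencellid`, the port 27017 and the idea that the web application talks to a MongoS on its own host are all assumptions, not details confirmed by this page.

```python
# Database hosts as listed in the server table.
DB_HOSTS = [
    "prod-ocid-db-01.colt.enaikoon.de",
    "prod-ocid-db-02.colt.enaikoon.de",
    "prod-ocid-db-03.colt.enaikoon.de",
]


def shard_seed(replica_set, hosts, port=27017):
    """Return '<set>/<host:port>,...', the form MongoDB uses to name a
    replica set when it is added to a cluster as a shard.
    The replica set name and port are assumptions."""
    members = ",".join(f"{host}:{port}" for host in hosts)
    return f"{replica_set}/{members}"


def mongos_uri(host="localhost", port=27017, database="opencellid"):
    """Return a standard mongodb:// URI pointing at a MongoS router.
    Host, port and database name are assumptions."""
    return f"mongodb://{host}:{port}/{database}"


if __name__ == "__main__":
    print(shard_seed("ocid-rs0", DB_HOSTS))
    print(mongos_uri())
```

Clients never address the three database servers directly in this arrangement; they send queries to MongoS, which consults the configuration servers to find the data.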

Challenges and solutions

The OpenCellID community is very strong and continuously contributes a large number of measurements.
This immediately poses a few challenges:

  • High Volume
    data arrives from many different sources and is growing rapidly
  • Scale
    data growth should come with predictable, incremental costs, and adding server resources should not require downtime
  • Data Processing
    analysis and processing of the rapidly growing data must remain efficient

The current solutions are based on MongoDB and its features:

  • Native Analytics
    the integrated aggregation framework and Map/Reduce calculate aggregates and analyses in place, with no need to export the data to other systems first
  • Advanced Geo Queries
    MongoDB's geospatial support is used to execute complex queries
  • Horizontal Scaling
    sharding makes it easy to scale applications horizontally on commodity hardware to accommodate constantly increasing throughput
  • Reduced Total Cost of Ownership (TCO)
    as an open-source data store, MongoDB is a very cost-effective solution
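To make the first two features more concrete, here is a minimal sketch of what an aggregation pipeline and a geospatial filter could look like for measurement data. The document layout (fields `cell`, `signal`, `loc`) and the sample values are invented for illustration and are not the actual OpenCellID schema. The `$group`, `$sort`, `$near`, `$geometry` and `$maxDistance` operators are standard MongoDB query language; the plain-Python functions are only offline stand-ins, so the logic can be checked without a database.

```python
from collections import defaultdict
from math import asin, cos, radians, sin, sqrt

# Hypothetical measurement documents; coordinates are [longitude, latitude].
MEASUREMENTS = [
    {"cell": "A", "signal": -70, "loc": {"type": "Point", "coordinates": [13.40, 52.52]}},
    {"cell": "A", "signal": -80, "loc": {"type": "Point", "coordinates": [13.41, 52.52]}},
    {"cell": "B", "signal": -90, "loc": {"type": "Point", "coordinates": [11.58, 48.14]}},
]

# Aggregation pipeline: count the measurements and average the signal per cell.
PER_CELL_PIPELINE = [
    {"$group": {"_id": "$cell", "samples": {"$sum": 1}, "avgSignal": {"$avg": "$signal"}}},
    {"$sort": {"_id": 1}},
]


def per_cell_stats(docs):
    """Plain-Python equivalent of PER_CELL_PIPELINE (illustration only)."""
    signals = defaultdict(list)
    for doc in docs:
        signals[doc["cell"]].append(doc["signal"])
    return [
        {"_id": cell, "samples": len(values), "avgSignal": sum(values) / len(values)}
        for cell, values in sorted(signals.items())
    ]


def near_query(lon, lat, max_metres):
    """Build a $near filter on the assumed 'loc' field.
    On a real server this needs a 2dsphere index on that field."""
    return {"loc": {"$near": {
        "$geometry": {"type": "Point", "coordinates": [lon, lat]},
        "$maxDistance": max_metres,
    }}}


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres on a sphere of radius 6371 km."""
    dlon, dlat = radians(lon2 - lon1), radians(lat2 - lat1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6_371_000 * asin(sqrt(a))


def near_offline(docs, lon, lat, max_metres):
    """Offline stand-in for near_query: keep documents within range, nearest first."""
    hits = [(haversine_m(lon, lat, *doc["loc"]["coordinates"]), doc) for doc in docs]
    return [doc for dist, doc in sorted(hits, key=lambda hit: hit[0]) if dist <= max_metres]
```

With a driver, `PER_CELL_PIPELINE` would be handed to the collection's aggregate call and the result of `near_query(...)` to its find call, so both run inside the database rather than in the web application.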