Difference between revisions of "Infrastructure"

From OpenCellID wiki

Revision as of 16:22, 14 January 2014

Servers

[Figure: OpenCellID server structure (OpenCellID server strukture.PNG)]
Server | Software | Operating system | Resources
prod-ocid-web-01.colt.enaikoon.de | Apache + Tomcat + MongoS | Ubuntu 12.04 LTS | 2 vCPU, 4 GB
prod-ocid-web-02.colt.enaikoon.de | Apache + Tomcat + MongoS | Ubuntu 12.04 LTS | 2 vCPU, 4 GB
prod-ocid-cfgsrv-01.colt.enaikoon.de | MongoDB ConfigServer | Ubuntu 12.04 LTS | 1 vCPU, 2 GB
prod-ocid-web-02.colt.enaikoon.de | MongoDB ConfigServer | Ubuntu 12.04 LTS | 1 vCPU, 2 GB
prod-ocid-web-03.colt.enaikoon.de | MongoDB ConfigServer | Ubuntu 12.04 LTS | 1 vCPU, 2 GB
prod-ocid-db-01.colt.enaikoon.de | MongoDB Replication Set | Ubuntu 12.04 LTS | 4 vCPU, 48 GB
prod-ocid-db-02.colt.enaikoon.de | MongoDB Replication Set | Ubuntu 12.04 LTS | 4 vCPU, 48 GB
prod-ocid-db-03.colt.enaikoon.de | MongoDB Replication Set | Ubuntu 12.04 LTS | 4 vCPU, 48 GB

Software stack

Operating System

All OpenCellID servers run Ubuntu Linux 12.04 LTS.

Frontend

  • The web frontend uses the Apache web server as a proxy that forwards web requests to Tomcat.
  • The OpenCellID web application runs on Tomcat and reads cell measurement data from, and writes it to, the MongoDB database backend.
  • jQuery Mobile provides the cross-platform user interface.
  • The map is displayed using OpenStreetMap combined with the Leaflet library.

Database Backend

  • The database backend, which currently holds 4.4 million cell towers and about 565 million measurements (as of 1 January 2014), is a MongoDB database cluster with six servers:
    • Three servers act as MongoDB configuration servers
    • The other three servers act as the database backend, with one replica set spread across the three servers
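
As a rough illustration of the replica set described above, the following Python sketch builds a replica set configuration document with one member per database server. The hostnames come from the server table; the set name `ocid-rs0` and port 27017 are assumptions made only for this example, and the helper `replica_set_config` is hypothetical, not part of the OpenCellID code base.

```python
# Hostnames are taken from the server table above.
DB_HOSTS = [
    "prod-ocid-db-01.colt.enaikoon.de",
    "prod-ocid-db-02.colt.enaikoon.de",
    "prod-ocid-db-03.colt.enaikoon.de",
]


def replica_set_config(name, hosts, port=27017):
    """Build a replica set configuration document with one member per host.

    The set name and the port are assumptions for illustration only.
    """
    return {
        "_id": name,
        "members": [
            {"_id": index, "host": f"{host}:{port}"}
            for index, host in enumerate(hosts)
        ],
    }


# Hypothetical set name; the real name is not given in this document.
config = replica_set_config("ocid-rs0", DB_HOSTS)
```

A document of this shape is what MongoDB expects when a replica set is initiated; the values actually used in production may differ.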

Challenges and solutions

The OpenCellID community is very strong and continuously provides a large number of measurements.
This immediately poses a few challenges:

  • High Volume
    data arrives from many different sources and is growing rapidly
  • Scale
    data growth should come with predictable, incremental costs, and adding server resources should not require any downtime
  • Data Processing
    analysis and processing must remain efficient as the data grows rapidly

The current solutions are based on MongoDB and its features:

  • Native Analytics
    the integrated aggregation framework and Map/Reduce calculate aggregates and analyses in place, without first exporting data to other systems
  • Advanced Geo Queries
    MongoDB's geospatial support is used to execute complex location-based queries
  • Horizontal Scaling
    sharding makes it easy to scale applications horizontally on commodity hardware to accommodate constantly increasing throughput
  • Reduced Total Cost of Ownership (TCO)
    as an open-source data store, MongoDB is a very cost-effective solution
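
To make the analytics and geo query features above more concrete, here is a minimal Python sketch that builds an aggregation pipeline and a geospatial filter as plain documents. The field names (`mcc`, `mnc`, `lac`, `cellid`, `signal`, `location`) and both helper functions are hypothetical, since the actual OpenCellID schema is not described here; only the operators (`$match`, `$group`, `$sort`, `$geoWithin`) are standard MongoDB.

```python
def measurements_per_cell_pipeline(mcc):
    """Aggregation pipeline: count measurements and average signal per cell.

    All field names are assumptions, not the real OpenCellID schema.
    """
    return [
        # Keep only measurements from one mobile country code.
        {"$match": {"mcc": mcc}},
        # Group by cell and compute the aggregates inside the database.
        {"$group": {
            "_id": {"mnc": "$mnc", "lac": "$lac", "cellid": "$cellid"},
            "measurements": {"$sum": 1},
            "avg_signal": {"$avg": "$signal"},
        }},
        # Cells with the most measurements first.
        {"$sort": {"measurements": -1}},
    ]


def cells_in_box_query(west, south, east, north):
    """Geospatial filter: documents whose location lies inside a bounding box.

    Coordinates are [longitude, latitude]; the ring is closed by repeating
    the first point, as GeoJSON polygons require.
    """
    ring = [
        [west, south], [east, south], [east, north],
        [west, north], [west, south],
    ]
    return {
        "location": {
            "$geoWithin": {
                "$geometry": {"type": "Polygon", "coordinates": [ring]}
            }
        }
    }
```

With a driver such as PyMongo (an assumption; the web application itself runs on Tomcat), the pipeline would be passed to `collection.aggregate(...)` and the filter to `collection.find(...)`, so the work happens inside MongoDB rather than in an external system.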