Difference between revisions of "Infrastructure"
Latest revision as of 19:02, 3 January 2015
==Servers==
==Software stack==
===Operating System===
All OpenCellID servers run Ubuntu Linux 12.04 LTS.
===Frontend===
===Database Backend===
The OpenCellID backend uses the Kafka queuing system to handle periodic peaks in load. Kafka producers embedded in the web application send all incoming data to Kafka brokers. Kafka consumers pull data from the brokers, process the measurements, and store them in MongoDB.

The database backend, which currently holds 7 million cell towers and about 1.2 billion measurements (1.1.2015), is a MongoDB database cluster with six servers:
*three servers serve as MongoDB configuration servers and Zookeeper instances
*the other three servers serve as the database backend, with one replication set spread across them, and as Kafka brokers
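The producer/consumer flow described above can be sketched in plain Java. This is only an illustration of the buffering idea, not the OpenCellID server code: an in-memory <code>BlockingQueue</code> stands in for the Kafka brokers, a list stands in for MongoDB, and the class name, the <code>Measurement</code> fields, and the plausibility check are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: the queue plays the role of the Kafka brokers and the
// returned list plays the role of MongoDB. No name here comes from OpenCellID.
class IngestSketch {
    // Assumed shape of one measurement; the real schema is not documented here.
    static final class Measurement {
        final int cellId;
        final double lat;
        final double lon;

        Measurement(int cellId, double lat, double lon) {
            this.cellId = cellId;
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Marker object that tells the consumer the stream has ended.
    private static final Measurement END = new Measurement(-1, 0.0, 0.0);

    // Assumed processing step: reject coordinates outside the valid range.
    static boolean isPlausible(Measurement m) {
        return m.lat >= -90 && m.lat <= 90 && m.lon >= -180 && m.lon <= 180;
    }

    static List<Measurement> run(List<Measurement> incoming) {
        BlockingQueue<Measurement> broker = new LinkedBlockingQueue<>();
        List<Measurement> store = new ArrayList<>();

        // Consumer: pulls from the queue at its own pace, processes, stores.
        Thread consumer = new Thread(() -> {
            try {
                for (Measurement m = broker.take(); m != END; m = broker.take()) {
                    if (isPlausible(m)) {
                        store.add(m);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            // Producer: the web tier hands data off and returns immediately,
            // so a burst of uploads fills the queue instead of the database.
            for (Measurement m : incoming) {
                broker.put(m);
            }
            broker.put(END);
            consumer.join(); // after join, the consumer's writes are visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while ingesting", e);
        }
        return store;
    }
}
```

The point of the queue is decoupling: the producer side never waits for storage, and the consumer side works through a backlog at whatever rate the database sustains. A real Kafka deployment adds persistence and partitioning across brokers, which this sketch does not model.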
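==Challenges and solutions==
The OpenCellID community is very strong and continuously provides a high number of measurements.
*Data Processing<br>the analysis and processing of the rapidly growing data must remain efficient at all times

The current solutions are based on the Kafka queuing system and MongoDB with its features:<br>
* Native Analytics<br>using the integrated aggregation framework and Map/Reduce to calculate aggregates and analyses in place, without first exporting the data to other systems
* Advanced Geo Queries<br>using MongoDB's geospatial support to execute complex queries
* Horizontal Scaling<br>sharding makes it easy to scale applications horizontally on commodity hardware to accommodate constantly increasing throughput
* Reduced Total Cost of Ownership (TCO)<br>as open-source software, MongoDB and the Kafka queuing system are a very cost-effective solution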
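To make the analytics and geo-query features above more concrete, here is a minimal plain-Java sketch of the two kinds of computation involved. It does not use MongoDB at all: the grouping step mimics what an aggregation with <code>$group</code> and <code>$avg</code> would do, and the radius filter mimics a geospatial "near" query. Estimating a tower position as the mean of its measurements is an assumption chosen for illustration, and every class, field, and method name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of in-place analytics and a radius query, in plain Java.
class AnalyticsSketch {
    static final class Measurement {
        final int cellId;
        final double lat;
        final double lon;

        Measurement(int cellId, double lat, double lon) {
            this.cellId = cellId;
            this.lat = lat;
            this.lon = lon;
        }
    }

    static final class Tower {
        final int cellId;
        final double lat;
        final double lon;
        final int samples;

        Tower(int cellId, double lat, double lon, int samples) {
            this.cellId = cellId;
            this.lat = lat;
            this.lon = lon;
            this.samples = samples;
        }
    }

    // Group measurements by cell and average their coordinates.
    // Assumption: a naive mean is good enough; it is wrong near the
    // antimeridian, where longitudes jump between -180 and 180.
    static List<Tower> estimateTowers(List<Measurement> measurements) {
        Map<Integer, double[]> sums = new TreeMap<>(); // {latSum, lonSum, count}
        for (Measurement m : measurements) {
            double[] s = sums.computeIfAbsent(m.cellId, k -> new double[3]);
            s[0] += m.lat;
            s[1] += m.lon;
            s[2] += 1;
        }
        List<Tower> towers = new ArrayList<>();
        for (Map.Entry<Integer, double[]> e : sums.entrySet()) {
            double[] s = e.getValue();
            towers.add(new Tower(e.getKey(), s[0] / s[2], s[1] / s[2], (int) s[2]));
        }
        return towers;
    }

    // Great-circle distance in kilometres (haversine formula, mean Earth radius).
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * 6371.0 * Math.asin(Math.sqrt(a));
    }

    // Linear scan for towers within radiusKm of a point. A database would
    // answer this from a geospatial index instead of scanning everything.
    static List<Tower> towersNear(List<Tower> towers, double lat, double lon, double radiusKm) {
        List<Tower> hits = new ArrayList<>();
        for (Tower t : towers) {
            if (haversineKm(lat, lon, t.lat, t.lon) <= radiusKm) {
                hits.add(t);
            }
        }
        return hits;
    }
}
```

Running both steps inside the database, as the feature list describes, avoids moving more than a billion measurements to a separate analytics system; the sketch only shows the shape of the computation on a small in-memory list.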
==The brain==
Krzysztof Ociepa (email: [email protected]) designed the big-data infrastructure as well as the new OpenCellID server software, which is based on Java, the Kafka queuing system, and MongoDB. He also implemented most of the current features after two other developers failed to do so.

Details about the implemented software and infrastructure can be found above.

There are plans to publish the entire server software as open source to encourage other members of the OpenCellID community to contribute software features. This will most likely happen before the end of 2015.