Our objective was to provide an end-to-end solution for the client's big data requirements, covering the following -
1) Gather data from different sources.
2) Data storage.
3) Data processing.
4) Reports and Dashboard generation.
The major challenges that we faced in this project, as in any other big data project, were -
1) Massive volume of data.
2) A wide variety of data to be captured, integrated and analyzed.
3) High velocity of transactions (data flow).
Owing to the complexity and criticality of the project, we used a very structured process to minimize production risks.
We started with a pilot project in which we broke the work down into individual steps: data capture, storage, data churning and then data analysis.
We have used the following tools -
Spring XD - Data capturing.
Pivotal Hadoop - Data Storage.
Pivotal HAWQ - Data Processing.
Greenplum - Storage of the processed data, so that we also retain historical data.
Tableau - Report and dashboard generation.
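The step-to-tool mapping above can be sketched as a small staged pipeline. This is only an illustrative sketch: the stage functions, the record fields ("source", "amount") and the sample data are hypothetical stand-ins that run in memory, and none of them call Spring XD, Pivotal Hadoop, HAWQ, Greenplum or Tableau. It shows only how each step hands its output to the next.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass(frozen=True)
class Stage:
    step: str   # pipeline step named in the text
    tool: str   # tool the text assigns to that step
    run: Callable[[List[Record]], List[Record]]  # hypothetical stand-in


def capture(records: List[Record]) -> List[Record]:
    # Stand-in for data capture: keep only records that name a source.
    return [r for r in records if r.get("source")]


def store(records: List[Record]) -> List[Record]:
    # Stand-in for storage: return independent copies of the records.
    return [dict(r) for r in records]


def process(records: List[Record]) -> List[Record]:
    # Stand-in for processing: total the (assumed) "amount" per source.
    totals: Dict[str, int] = {}
    for r in records:
        totals[r["source"]] = totals.get(r["source"], 0) + r["amount"]
    return [{"source": s, "total": t} for s, t in sorted(totals.items())]


def archive(records: List[Record]) -> List[Record]:
    # Stand-in for the historical store: mark processed rows as archived.
    return [dict(r, archived=True) for r in records]


def report(rows: List[Record]) -> List[str]:
    # Stand-in for the reporting step (Tableau in the text): plain text lines.
    return [f"{row['source']}: {row['total']}" for row in rows]


PIPELINE = [
    Stage("capture", "Spring XD", capture),
    Stage("storage", "Pivotal Hadoop", store),
    Stage("processing", "Pivotal HAWQ", process),
    Stage("historical store", "Greenplum", archive),
]


def run_pipeline(records: List[Record], stages=PIPELINE) -> List[Record]:
    # Feed the output of each stage into the next, in order.
    for stage in stages:
        records = stage.run(records)
    return records


if __name__ == "__main__":
    sample = [
        {"source": "web", "amount": 10},
        {"source": "", "amount": 5},
        {"source": "web", "amount": 2},
        {"source": "pos", "amount": 7},
    ]
    for line in report(run_pipeline(sample)):
        print(line)
```

Keeping each step behind its own function mirrors the pilot approach of validating capture, storage, processing and analysis one at a time.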
Our Data Scientist in the US worked hand in hand with our Data Engineering team to put the entire solution together.