We are looking for a seasoned database/infrastructure group (or person) to help us rebuild our existing infrastructure.
The Basics:
• 2-3 TB of new data every day (uncompressed).
• Tens of billions of analytics events created per day. An event is a user action in the app (shaking it, rotating it, turning it off, etc.).
• Over 700 billion analytics events already processed.
• During typical MapReduce jobs the system can crunch through over 500 thousand events per second.
• Data is often available in the analytics platform within 30 minutes of coming in from mobile devices
• Approximately 30 servers total.
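As a rough sanity check on the figures above, here is a minimal back-of-the-envelope sketch. It assumes decimal terabytes (10^12 bytes) and, purely for illustration, a hypothetical 20 billion events per day, since "tens of billions" is not an exact count. The function names are ours and not part of any existing tooling.

```python
# Back-of-the-envelope arithmetic for the volumes listed above.
# Assumptions (not from the listing): 1 TB = 10**12 bytes, and the
# 20 billion events/day used in the demo is a hypothetical value.

SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 10**12


def avg_bytes_per_second(tb_per_day: float) -> float:
    """Average sustained ingest rate implied by a daily data volume."""
    return tb_per_day * BYTES_PER_TB / SECONDS_PER_DAY


def events_per_day_at(rate_per_second: float) -> float:
    """Events processed in 24 hours at a constant per-second rate."""
    return rate_per_second * SECONDS_PER_DAY


def avg_bytes_per_event(tb_per_day: float, events_per_day: float) -> float:
    """Average uncompressed size of one event."""
    return tb_per_day * BYTES_PER_TB / events_per_day


if __name__ == "__main__":
    for tb in (2, 3):
        mb_s = avg_bytes_per_second(tb) / 10**6
        print(f"{tb} TB/day is about {mb_s:.1f} MB/s of average ingest")
    print(f"500k events/s sustained = {events_per_day_at(500_000):,.0f} events/day")
    # Hypothetical event count, for illustration only.
    print(f"2 TB over 20e9 events = {avg_bytes_per_event(2, 20e9):.0f} bytes/event")
```

Sustained for a full day, 500 thousand events per second works out to 43.2 billion events, which is consistent with the "tens of billions per day" figure.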
The Current and/or Proposed Architecture:
• Ad Server instances running on quad core x 16GB RAM w/ 1TB
• Event Tracking Servers on 16 core x 12GB RAM w/ 10TB
• Event Processing, Job Execution, Log Collection & Aggregate Pre-Processing: each 16 core x 24GB RAM w/ 6TB of space
• Log Collection & Aggregate Pre-Processing and Post-Processing: Hadoop Cluster nodes, each 16 core x 12GB RAM w/ 2x2TB (JBOD)
• Analytics: varying configurations, all with 16 cores, from 12 to 96GB of RAM, w/ 1-2TB