Senior Data Engineer
- Identify and design data requirements and access patterns for big data initiatives.
- Identify and integrate data processing and reporting frameworks using Hadoop and MapReduce.
- Implement a near real-time ETL process to handle hundreds of data sources using Spark and Kafka.
- Monitor performance and advise on any infrastructure changes needed to scale the ETL process.
- Define and design data retention policies leveraging data stores such as Vertica, Hive or Redshift.
- Set up ETL jobs using workflow systems including Azkaban and Oozie.
- Educate the team on new technologies and best practices for interacting with the data platform.
- Own the entire ETL process for ingesting data, and help design and support data products that use real-time data.
- Assist as required on data extraction for Data Scientists and other internal or external parties.
- Bachelor’s (or educ. equiv.) Degree in Computer Science or Information Technology and five (5) yrs. (post-degree, progressive) experience in Job Offered. Alternatively, will accept Master’s (or educ. equiv.) Degree in Computer Science or Information Technology and three (3) yrs. experience in Job Offered.
- At least 2 yrs. experience must have included:
- working with batch processing and tools in the Hadoop technology stack, including MapReduce, Pig, Hive and HDFS;
- building real-time systems with Storm or Spark data transformation pipelines;
- large data store experience, including HBase, HDFS, Vertica, Redshift; data modeling and performance tuning;
- experience with message queues/brokers such as Kestrel, Kinesis or Kafka; and
- working with workflow systems including Azkaban and Oozie.