Drizly

  • Identify and design data requirements and access patterns for big data initiatives.
  • Identify and integrate data processing and reporting frameworks using Hadoop and MapReduce.
  • Implement a near real-time ETL process to handle hundreds of data sources using Spark and Kafka (see the illustrative sketch after this list).
  • Monitor performance and advise on any infrastructure changes needed to scale the ETL process.
  • Define and design data retention policies leveraging data stores such as Vertica, Hive or Redshift.
  • Set up ETL jobs using workflow systems including Azkaban and Oozie.
  • Educate the team on new technologies and best practices for interacting with the data platform.
  • Own the entire ETL process for data ingestion, and help design and support data products built on real-time data.
  • Assist as required with data extraction for Data Scientists and other internal or external parties.
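
For illustration only, a minimal sketch of the kind of near real-time Spark-plus-Kafka ETL pipeline described above. The broker address, topic name, event schema, and output paths are hypothetical assumptions, not details from this posting.

```python
# Minimal sketch: read events from Kafka with Spark Structured Streaming,
# parse JSON payloads, and land micro-batches in a staging area.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("drizly-etl-sketch").getOrCreate()

# Hypothetical event schema for an order stream.
event_schema = StructType([
    StructField("order_id", StringType()),
    StructField("source", StringType()),
    StructField("created_at", TimestampType()),
])

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker address
    .option("subscribe", "orders")                      # assumed topic name
    .load()
)

# Kafka delivers the payload as bytes; cast to string and parse the JSON body.
parsed = (
    raw.select(from_json(col("value").cast("string"), event_schema).alias("e"))
       .select("e.*")
)

# Write append-only micro-batches to a staging path, e.g. before loading a warehouse.
query = (
    parsed.writeStream
    .format("parquet")
    .option("path", "/data/staging/orders")              # assumed staging path
    .option("checkpointLocation", "/data/checkpoints/orders")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```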

Requires:

  • Bachelor’s (or educ. equiv.) Degree in Computer Science or Information Technology and five (5) yrs. (post-degree, progressive) experience in Job Offered.  Alternatively, will accept Master’s (or educ. equiv.) Degree in Computer Science or Information Technology and three (3) yrs. experience in Job Offered.
  • At least 2 yrs. experience must have included:
    • working with batch processing and tools in the Hadoop technology stack, including MapReduce, Pig, Hive and HDFS;
    • building real-time data transformation pipelines with Storm or Spark;
    • experience with large data stores, including HBase, HDFS, Vertica and Redshift, as well as data modeling and performance tuning;
    • experience with message queues/brokers such as Kestrel, Kinesis or Kafka; and
    • working with workflow systems including Azkaban and Oozie.
