The iTunes Store Big Data engineering team is looking for talented mid- and senior-level engineers to build and enhance the features and technology infrastructure driving the iTunes Store, App Store, Apple Music and iBookstore. Our team is responsible for much of iTunes’ big data infrastructure, as well as for designing and delivering key systems powering Apple Music, iTunes Charts and many other personalized and cloud features of the iTunes ecosystem.
This is your opportunity to help build global-scale, leading-edge big data systems, with positions available in San Francisco, London and Cupertino.
Responsibilities
Collaborate with application engineers, data scientists, and analysts across iTunes Engineering to design and implement real-time and offline data pipelines, storage infrastructure, and schemas
Develop processes and tools to integrate and distill data from various sources for use in powering data-centric features, including popularity metrics, search, and recommendations (the offline side of such a pipeline is sketched below)
Work with other data engineers to provide a unified set of data for internal and external consumption by teams across engineering and beyond
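As a rough illustration of the offline side of such a pipeline, here is a minimal Spark sketch in Scala that distills raw play events into a per-item popularity metric. The paths, schema and job name are hypothetical, not a description of any actual iTunes system.

```scala
// Minimal sketch of an offline pipeline of the kind described above:
// aggregate raw play events into a per-item popularity metric.
// All paths, column names and the job name are hypothetical.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object PopularityJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("popularity-metrics")
      .getOrCreate()

    // Hypothetical event log: one row per play, with itemId and timestamp columns.
    val events = spark.read.parquet("hdfs:///events/plays")

    // Distill raw events into a daily play count per item.
    val popularity = events
      .groupBy(col("itemId"), to_date(col("timestamp")).as("day"))
      .agg(count(lit(1)).as("plays"))

    // Publish for downstream consumers (charts, search ranking, recommendations),
    // which read the distilled metric rather than the raw events.
    popularity.write.mode("overwrite").parquet("hdfs:///metrics/popularity_daily")

    spark.stop()
  }
}
```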
Key Qualifications
Proficiency in Scala and Java required
Deep understanding of Object-Oriented Programming, software architecture, design patterns, and software development best practices
Experience with Spark and the Hadoop platform (Hadoop, MapReduce, HDFS)
Comfortable with Linux command-line tools and basic shell scripting
At least 5 years of relevant software development experience
At least 2 years of relevant Big Data experience
Description
Our work covers the full stack: iTunes’ internet-facing services (public HTTP services); internal services used by customer features (internal RPC APIs); the design and implementation of data pipelines and lifecycles; Hadoop infrastructure, strategy and implementation; distributed key-value storage (Voldemort, Cassandra, HBase, etc.); and putting all of this together to operate live customer-facing features with millisecond latencies across multiple data centers, petabyte datasets and more than 500 million users. As a result, we have opportunities available for a variety of related specializations within big data and server engineering. Whether you’re an all-round, performance-savvy Java server engineer, a Kafka expert, a Spark expert, someone who lives and breathes MapReduce, or the go-to person for designing effective data schemas for big data usage patterns, you could make a big splash here.
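On the serving side, here is a minimal sketch of the kind of low-latency read path such a feature might use. The KeyValueStore trait is a hypothetical stand-in for a real client library (Voldemort, Cassandra and HBase each have their own APIs), and all key and class names are illustrative.

```scala
// Sketch of a feature endpoint answering from a distributed key-value store.
// KeyValueStore is a hypothetical abstraction over a real client library.
import scala.concurrent.{ExecutionContext, Future}

trait KeyValueStore {
  def get(key: String): Future[Option[Array[Byte]]]
}

final class ChartService(store: KeyValueStore)(implicit ec: ExecutionContext) {
  // Looks up a precomputed chart by key; on a miss it returns an empty
  // result rather than recomputing inline, keeping tail latency bounded.
  def topSongs(storefront: String): Future[Seq[String]] =
    store.get(s"charts/$storefront/top-songs").map {
      case Some(bytes) => new String(bytes, "UTF-8").split('\n').toSeq
      case None        => Seq.empty
    }
}
```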
Education
Bachelor’s in Computer Science, Information Systems, Engineering or equivalent; Master’s preferred
Additional Requirements
Experience building and/or using distributed systems, distributed caching, and distributed key-value or column stores (e.g. Cassandra, Voldemort, HBase)
A strong understanding of eventual consistency concepts
Experience with and understanding of Hadoop-ecosystem technologies such as Spark/Shark, Storm, Impala, YARN/MR2, Hive, Pig, Cascading, Apache Crunch, M/R Streaming, or other Big Data technologies
Experience building and running large-scale data pipelines, including distributed messaging such as Kafka and data ingest to/from multiple sources, to feed batch compute components from HDFS and near-real-time components from key-value storage, as in the Lambda architecture (see the sketch after this list)
Experience and interest in data modeling and data architecture as optimized for big data patterns (warehousing concepts; efficient storage and query on HDFS; support for relevant real-time query patterns in key-value stores; columnar schema design; etc.)
Experience with Hadoop administration for MapR or Cloudera distributions
Excellent understanding of scheduling and workflow frameworks and principles
Experience with unit testing and data quality test automation
Previous experience developing APIs for high-throughput systems
Experience with Python, shell scripting and/or other scripting languages
Experience with Subversion and Git
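As a rough sketch of the Lambda-style split mentioned above, assuming Spark Structured Streaming as the speed layer over a hypothetical Kafka topic of play events. The broker address, topic name and console sink are placeholders; a real speed layer would write its running aggregates to a key-value store for serving.

```scala
// Lambda-style sketch: one Kafka topic of play events feeds both a batch
// layer (events landed on HDFS, recomputed periodically by a job like the
// one sketched earlier) and a speed layer (streaming aggregation below).
// Requires the spark-sql-kafka connector; all names are hypothetical.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object LambdaSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("lambda-sketch").getOrCreate()

    // Speed layer: consume play events from Kafka as they arrive.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "plays")
      .load()

    // Keep a running play count per item key; serving would merge this
    // near-real-time view with the batch layer's periodic recomputation.
    val counts = stream
      .select(col("key").cast("string"))
      .groupBy(col("key"))
      .count()

    counts.writeStream
      .outputMode("complete")
      .format("console") // placeholder sink; a real job targets a KV store
      .start()
      .awaitTermination()
  }
}
```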