The Data Engineering team at GitHub is looking for a savvy Data Engineer to join our growing team. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support our software developers, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products. The right candidate will be excited by the prospect of optimizing or even re-designing our company’s data architecture to support our next generation of products and data initiatives. If you have a passion for data and GitHub we'd love to talk to you.

Responsibilities:

Create and maintain tools supporting self-service data pipeline management (ETL)

Evolve data model and data schema to meet functional / non-functional business requirements

Monitor data quality and consistency

Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources

Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.

Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.

Qualifications:

You have experience building and optimizing data pipelines, architectures, and data sets.

You can build processes supporting data transformation, data structures, metadata, dependency and workload management.

You have a successful history of manipulating, processing and extracting value from large disconnected datasets.

You have experience developing on Git and GitHub.

We are looking for a candidate with 8+ years of experience in a Data Engineer role, who is educated in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools: Experience with Hadoop (or similar): Presto, Hive, Parquet, Spark, etc. Experience with data pipeline and workflow management tools: Airflow etc. Proficient in SQL Experience with scripting languages (e.g. Python)



Who We Are:

GitHub is the developer company. We make it easier for developers to be developers: to work together, to solve challenging problems, and to create the world’s most important technologies. We foster a collaborative community that can come together—as individuals and in teams—to create the future of software and make a difference in the world.

Leadership Principles:

Customer Obsessed - Trust by Default - Ship to Learn - Own the Outcome - Growth Mindset - Global Product, Global Team - Anything is Possible - Practice Kindness

Why You Should Join:

At GitHub, we constantly strive to create an environment that allows our employees (Hubbers) to do the best work of their lives. We've designed one of the coolest workspaces in San Francisco (HQ), where many Hubbers work, snack, and create daily. The rest of our Hubbers work remotely around the globe. Check out an updated list of where we can hire here: https://github.com/about/careers/remote

We are also committed to keeping Hubbers healthy, motivated, focused and creative. We've designed our top-notch benefits program with these goals in mind. In a nutshell, we've built a place where we truly love working, we think you will too.

GitHub is made up of people from a wide variety of backgrounds and lifestyles. We embrace diversity and invite applications from people of all walks of life. We don't discriminate against employees or applicants based on gender identity or expression, sexual orientation, race, religion, age, national origin, citizenship, disability, pregnancy status, veteran status, or any other differences. Also, if you have a disability, please let us know if there's any way we can make the interview process better for you; we're happy to accommodate!

Please note that benefits vary by country. If you have any questions, please don't hesitate to ask your Talent Partner.

#LI-POST