Amazon Development Center Romania is seeking a passionate Data Engineer with strong analytical, technical and communication skills to address increasingly complex business questions within the Robot Detection team. We process large data sets and real-time streams of website traffic in order to classify the various types of robots crawling Amazon’s customer-facing websites. By eliminating robot traffic from Amazon’s business metrics, we facilitate accurate data-driven decisions and forecasts (e.g. site-wide/category conversion rates, product stock vs. demand, scaling predictions). Mitigation actions based on the robots that we detect lead to multi-million hardware cost savings for website rendering and help protect competitive business data. We actively track and drive to zero the rate at which humans are misclassified as robots, and strive to provide the best experience to both customers and robots.
Amazon has a culture of data-driven decision-making and demands business intelligence that is timely, accurate, and actionable. This team provides a fast-paced environment where every day brings new challenges and new opportunities. You will help solve complex robot investigations and isolate robot incidents from data-quality issues to ensure the accuracy and precision of our systems. You will reverse-engineer robot business models to identify potential unmet business needs of external Amazon customers and sellers that the robots might be serving.
· 3+ years of professional experience in business analytics, data engineering or a comparable consumer analyst position handling large, complex data sets.
· Bachelor's degree in Math, Finance, Statistics, Engineering, or related discipline.
· Strong SAS/R, SQL and Excel expertise to access and transform data into insights.
· Comfortable working in a Unix/Linux environment with experience in a scripting language for managing large data sets.
· Strong troubleshooting and problem solving skills.
· Ability to quickly adapt to changing priorities and generate innovative solutions in an extremely fast-paced environment.
· Experience in working with business customers to drive requirements analysis.
· Excellent written and oral communication and interpersonal skills.
· Experience with Big Data solutions: Spark, Hadoop, Pig, Hive or other frameworks
· In-depth understanding of HTTP, TCP and other web-related protocols is a plus
· Demonstrated ability to frame complex analytical problems, pull data, and extract insights that lead to tangible results (revenue, seller launches, new product features, etc.).
· Strong organizational and multitasking skills with the ability to balance competing priorities.
· Experience partnering with business owners directly to understand their requirements and provide data which can help them observe patterns and spot anomalies.
· Outstanding speaking, writing, and presentation skills, as well as the ability to persuade, inspire, and motivate others. Ability to summarize key insights of complex solutions for technical and non-technical audiences (colleagues from computer science, machine learning and business backgrounds, as well as senior management decision-makers).
As a Data Engineer, you will be working in one of the world's largest and most complex data warehouse environments. You should be passionate about working with huge data sets, and be someone who loves to bring data sets together to answer business questions. You should have deep expertise in the creation and management of data sets. You will be working closely with retail category teams, machine-learning scientists, statisticians, software engineers, and various business groups.
The solutions you’ll be developing present unique challenges of space, size and speed. You will implement data analytics using cutting-edge analytics patterns and technologies including, but not limited to, Spark and Redshift. You will extract huge volumes of data from various sources and message streams and construct complex analyses. You will write scalable queries and tune performance on queries running over billions of rows of data. You will define and track key metrics to avoid regressions.