We are looking for a savvy Data Engineer with a technical background in data systems and platform development. The hire will be responsible for expanding and optimizing our data and data pipeline infrastructure, as well as optimizing data flow and collection for cross-functional teams. If you want to join a world-class development team at LINE, we look forward to hearing from you soon!
Responsibilities
1. Develop a next-generation data processing and analytics system for the Taiwan market and integrate it with the global platform.
2. Work with TW project teams to identify, design, and implement internal data process improvements, such as automating manual processes and optimizing data delivery.
3. Build analytics tools that utilize the data pipeline to provide insights into customer needs, operational efficiency, and other key business performance metrics.
4. Assemble large, complex data sets that meet functional and non-functional business requirements.
5. Work closely with our development teams in Tokyo and Seoul.
6. As a member of the New Initiatives Task Force, you might also:
- Create APIs and services that allow third parties to integrate with and utilize the LINE platform.
- Design and build core backend software components for the Messaging, Social Graph, and Partnership platforms.
Required Skills and Qualifications
1. At least 3 years of hands-on software development experience with Java, Scala, or Python.
2. B.S. or M.S. in Computer Science or a related field.
3. Experience building distributed services and handling big data is preferred.
4. Fluent in English.
5. Experience with big data tools or stream-processing systems such as Hadoop, MapReduce, ZooKeeper, HDFS, HBase, Hive, Spark, Kafka, Storm, Spark Streaming, etc.
6. Experience in custom ETL design, implementation, and maintenance on Hadoop clusters.
7. Experience with multi-threaded programming and debugging.
8. Proficiency in Linux and shell scripting.
9. Good understanding of distributed systems and of basic mathematics such as statistics and probability.
Preferred Skills and Qualifications
1. Experience building and optimizing "big data" pipelines, architectures, and data sets.
2. Analytical skills related to working with unstructured datasets.
3. Technical capacity to understand and implement at-least-once / exactly-once delivery guarantees in distributed data pipelines.
4. Strong SQL skills, especially in the area of data aggregation.
5. Experience with troubleshooting and tuning JVM garbage collection (GC).
6. Experience with Maven and Git.
7. Experience with A/B testing environments.