Signify is working with a large communications company in Europe on an exciting new role focused on designing more efficient and valuable data pipelines.
The task: As a Scala Developer, you will be responsible for designing, developing, and maintaining data pipelines that will provide valuable insights for the company.
By adopting a DevOps approach, you will keep the system running smoothly, automating routine tasks so that time goes into building new features rather than manual deployment.
You will also be responsible for testing and monitoring the system using appropriate methods and tools.
Core requirements:
- 3+ years of programming in Scala with Apache Spark
- Strong knowledge of ETL processes
- DevOps knowledge and experience (2 years minimum)
- Prior use of Hadoop and HDFS for large file storage (must be able to work with these autonomously)

Your responsibilities will include:
- Developing data architectures
- Contributing to the short, mid, and long term vision of the system
- Extracting, transforming, and loading data from large and complex data sets
- Ensuring that data is easily accessible and performs well in scalable environments
- Participating in the planning and architecture of the big data platform to optimize performance
- Building large data warehouses for further reporting or advanced analytics
- Collaborating with machine learning engineers to implement and deploy solutions
- Ensuring robust CI/CD processes are in place

The tech stack at a glance: Scala, Spark, Hadoop, Databricks, Kafka, AWS, and more.
To be successful in this role, you should have strong knowledge and experience with Scala and Spark.
You should also have experience with SQL and NoSQL databases, and be familiar with CI/CD concepts.
In addition, you should have technical knowledge in data pipeline management, workflow management (such as Oozie or Airflow), and large file storage (such as HDFS, Data Lake, S3, or Blob storage).
Experience with stream processing technologies (such as Kafka, Kinesis, or Elasticsearch) and a cloud environment (such as Hadoop, Cloudera, EMR, or Databricks) is a plus.
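For a sense of the day-to-day work, the snippet below is a minimal sketch of the kind of Scala and Spark batch ETL job the role describes: extract raw data from large file storage, transform it, and load it into a warehouse-friendly format for reporting. All paths, column names, and the object name are hypothetical placeholders, not details of the company's actual system.

    // Minimal illustrative sketch only; paths and columns are hypothetical.
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object CustomerEventsEtl {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("customer-events-etl")
          .getOrCreate()

        // Extract: read raw JSON events from large file storage (e.g. HDFS or S3).
        val raw = spark.read.json("hdfs:///data/raw/customer_events/") // hypothetical path

        // Transform: clean the events and aggregate them into a daily summary.
        val daily = raw
          .filter(col("event_type").isNotNull)                         // hypothetical column
          .withColumn("event_date", to_date(col("event_timestamp")))   // hypothetical column
          .groupBy("customer_id", "event_date")
          .agg(count("*").as("event_count"))

        // Load: write partitioned Parquet for downstream reporting and analytics.
        daily.write
          .mode("overwrite")
          .partitionBy("event_date")
          .parquet("hdfs:///data/warehouse/daily_customer_events/")    // hypothetical path

        spark.stop()
      }
    }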