Location: Barcelona, New York, New Jersey, London
Department: Technology
About the Role
Dow Jones is seeking a skilled Data Scientist to join our AI Engineering Team. You will design, construct, and maintain machine learning pipelines tailored for various AI applications, particularly focusing on Natural Language Processing. Collaborating with our engineering team, you'll integrate machine learning models, optimize performance, and ensure effective real-world application of machine learning solutions.
As a key team member, you will be involved in every aspect of the data science project development process, from data analysis and model selection to building proofs of concept and refining pipelines. You will leverage your expertise in statistical analysis and machine learning techniques to derive insights from data, address the organization's needs, and deliver tangible value through actionable outcomes.
Responsibilities:
- Collaborate within the AI Engineering Team to maintain robust data pipelines supporting various ML models, focusing on information retrieval applications.
- Analyze and clean large datasets to optimize reusable ML models.
- Partner with stakeholders across the company to translate business requirements into technical solutions.
- Apply analytical skills to NLP modeling, algorithm selection, and proof-of-concept (POC) development.
- Develop data enrichment pipelines to enhance insights aligned with strategic objectives.
- Collaborate with cross-functional teams to address the organization's data-driven needs, managing significant volumes of structured and unstructured data.
- Integrate diverse ML models into systems, ensuring interoperability and performance optimization.
- Lead efforts to optimize ML model performance through data analysis and validation in real-world applications.
- Stay updated on AI, ML, and NLP advancements, incorporating emerging trends into processes.
- Mentor junior team members, fostering collaboration and professional development.

Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related STEM field.
- At least 5 years of industry experience in a data science or machine learning engineering role.
- Strong programming skills in Python and/or another high-level language commonly used in machine learning.
- Experience with NLP and machine learning frameworks and libraries (e.g., PyTorch, HuggingFace, LangChain, spaCy, NLTK, scikit-learn).
- Demonstrated understanding of various techniques for extracting structured data from unstructured sources, indicating expertise in information retrieval.
- Familiarity with LLM APIs for pre-processing, fine-tuning, and deploying models on cloud-based infrastructure.
- Familiarity with cloud-based infrastructure and services (e.g., AWS, GCP).
- Experience with containerization and orchestration technologies, such as Docker and Kubernetes, on cloud-based infrastructure.
- Experience with version control systems such as Git.
- Strong communication and teamwork skills, and a passion for delivering impactful outcomes in document analysis, classification, summarization, translation, personalization, and chatbots.