Job Reference: 605_24_CASE_WP_RE1
Position: Research Engineer - ML-Aided Mathematical Optimization (MAMO) - (RE1)
Closing Date: Monday, 30 September, 2024
About BSC: The Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS) is the leading supercomputing center in Spain. It houses MareNostrum, one of the most powerful supercomputers in Europe, and is the hosting entity for EuroHPC JU, the Joint Undertaking that leads large-scale investments and HPC provision in Europe. The mission of BSC is to research, develop and manage information technologies in order to facilitate scientific progress. BSC combines HPC service provision and R&D into both computer and computational science (life, earth and engineering sciences) under one roof, and currently has over 1000 staff from 60 countries.
Context And Mission: In this topic, a pure binary integer linear programming (ILP) problem is used to address a fundamental question in Natural Language Processing (NLP): how to best compare and match the semantic contents of two natural language texts or "textual summaries" A and B, based on a given measure of the semantic similarity between their unit semantic constituents (USCs). The student intern will focus on reducing the computational cost of solving pure binary ILP problems. The approach will consist of porting existing Python code to a high performance computing node partition, possibly using the PyCOMPSs framework for orchestration in distributed compute environments, and reducing the set of feasible ILP solutions by means of reinforcement learning (RL).
Key Duties:
- Become familiar with several concepts: NLP, the measure of semantic similarity, the extraction of unit semantic constituents from unstructured text, and the ILP formulation.
- Establish a baseline workflow in Python, based on a conventional ILP solver applied to our joint set-packing and set-partitioning problem, using one or more FOSS ILP Python packages, and following the already formulated constraints of the ILP problem.
- Design a semantically informed, state-of-the-art Machine Learning method to reduce the set of feasible solutions for the initial semantic pairing ILP problem.
- Learn a generalizable model to infer acceptably reduced sets of feasible solutions for never-seen-before textual summaries, establishing suitable performance metrics.
- Relax the pure binary integer constraint on decision variables, and inspect the resulting semantic similarity solutions from a natural (human) language standpoint.
- Prepare a situation and final report to be delivered on the last day of this short-term contract, including a structured bibliography, a description of the methodologies used, the developed code and results, and prospective ideas for setting up an RL algorithm.
Requirements:
Education: Relevant scientific project experience gained through work experience or recent academic courses in computer science, mathematical analysis, and optimization schemes.
Essential Knowledge and Professional Experience: Collaborative and version-controlled work environments (e.g., GitLab, GitHub), Python 3, C/C++, Unix/Linux and related shell scripting, differential calculus, vector analysis, combinatorial optimization, convexity in optimization problems.
Additional Knowledge and Professional Experience: Fluent written and spoken English, knowledge of distributed computing, parallelization paradigms, and the COMPSs framework.