Company Description

At CERN, the European Organization for Nuclear Research, physicists and engineers are probing the fundamental structure of the universe. Using the world's largest and most complex scientific instruments, they study the basic constituents of matter - fundamental particles that are made to collide together at close to the speed of light. The process gives physicists clues about how particles interact, and provides insights into the fundamental laws of nature. Find out more on http://home.cern.

Job Description

Join the team operating and evolving the 5’000-node, HTCondor-based High-Throughput batch service at the heart of CERN data processing!

As an HTC Services Engineer in the Compute and Devices (IT-CD) group in charge of the HTC batch system, you will contribute to the efficient provisioning of compute resources to CERN users and experiments. Specifically, you will:

  • Ensure compute service delivery for the two main batch service use cases: local data processing and the Worldwide LHC Computing Grid Tier-0;
  • In close collaboration with the experiments, contribute to the scientific compute evolution strategy, which aims to re-architect and co-develop a set of supportable solutions for the future, e.g. in the areas of “Analysis Facilities” and “Interactive Compute”.

As a member of the Compute and Configuration (IT-CD-CC) section you will:

  • Participate in the ongoing service automation, monitoring and scaling activities for the scientific computing and interactive services;
  • Provide technical expertise and consulting to users of the section's services.

Functions

As an HTC Services Engineer, you'll manage the HTC batch system, handling administration, operation, maintenance, and supporting physicists with diverse software needs. Your role spans the entire system life cycle, from user requirements to infrastructure planning and software development.

In particular you will:

  • Be responsible for running the HTCondor-based batch system at CERN, ensuring appropriate operations and support to our communities;
  • Contribute to the design and development of “Infrastructure as Code” components to further automate the compute resource lifecycle, preventative and corrective maintenance, and component upgrades;
  • Engage and collaborate with upstream developers and external peer institutes;
  • Provide user support, including analysing user requirements, providing advice on best practices, and understanding usage patterns in order to plan for service improvements;
  • Help develop and design service evolution to grow the service along with the requirements of the LHC as it moves towards “High Luminosity”, and as user workflow expectations evolve.

Qualifications

Master's degree or equivalent relevant experience in the field of computer science or a related field.

Experience:

We are looking for someone with an interest in operating and optimizing large-scale, mission-critical production services with the following experience/skills:

  • Knowledge of system administration, in particular Linux environments;
  • Knowledge of configuration management tools such as Puppet or Ansible, and monitoring of distributed systems;
  • Programming techniques and languages, in particular Python or Go programming;
  • Dealing with user relations, user support, and user requirements definition;
  • Familiarity with Agile/Scrum methodologies and DevOps practices.

Additional experience/skills in the following areas would be an asset:

Experience with High-Throughput Computing (HTC) workload management systems, such as HTCondor.

Technical competencies:

  • Knowledge of operating systems;
  • Knowledge of system configuration tools;
  • Architecture and design of ICT systems;
  • Knowledge and application of software life-cycle tools and procedures;
  • Monitoring and responding to security threats and incidents for ICT systems.

Behavioural competencies:

  • Working in Teams: working well in groups and readily fitting into a team; participating fully and taking an active role in team activities; cooperating constructively with others in the pursuit of team goals; balancing personal goals with team goals.
  • Solving Problems: addressing complex problems by breaking them down into manageable components; recognizing what is essential; discriminating between important and peripheral information and being able to see the whole picture; testing solutions for long-term suitability, cross-checking with all concerned before implementation.
  • Managing Self: taking initiative beyond regular tasks and making things happen; working well autonomously; taking on activities and tasks without prompting.
  • Building Relationships: showing appreciation for the ideas and contributions of others and encouraging others to express their views, even if controversial; being able to put oneself in the shoes of others in order to understand their needs and interests.

Language skills:

Spoken and written English: ability to understand and speak the language in professional contexts. Ability to draw up technical specifications and/or scientific reports and to make oral presentations.

Additional Information

Eligibility and closing date:

Diversity has been an integral part of CERN's mission since its foundation and is an established value of the Organization. Employing a diverse workforce is central to our success. We welcome applications from all Member States and Associate Member States.

This vacancy will be filled as soon as possible, and applications should normally reach us no later than 19.01.2024.

Employment Conditions

Contract type: Limited duration contract (5 years). Subject to certain conditions, holders of limited-duration contracts may apply for an indefinite position.

These functions require:

  • Work during nights, Sundays and official holidays, when required by the needs of the Organization.
  • Stand-by duty, when required by the needs of the Organization.

Job grade: 6-7

Job reference: IT-CD-CC-2023-172-LD

Benchmark Job Title: Computing Engineer
