NPIC Senior Data Engineer (XN08)

Leeds Teaching Hospitals NHS Trust
£50,952 to £57,349 a year
Closing date
25 Sep 2023


Other Health Profession
Band 8A
Full Time
The National Pathology Imaging Co-operative (NPIC) programme is a £100m+ initiative supported by central funding from NHS England, Innovate UK and the Office for Life Sciences. NPIC has the ambition to provide a national platform to support pathology services.

The ambitions of the programme include:

- Driving clinical use of digital pathology nationally and providing a platform to support this

- Designing, building and maintaining a platform for the development of artificial intelligence and the evaluation of the AI lifecycle, and

- Providing a platform for further research & innovation and maximising the platform to facilitate this.

We are seeking to appoint an experienced individual who is driven, ambitious and passionate about improving patient care through the use of cutting-edge technology. The post-holder will be responsible for building, testing and maintaining data architecture, and for undertaking new developments while maintaining existing data models and reporting relating to the deployment of the digital pathology national footprint.

The role is crucial in laying the foundation that enables data scientists and data analysts to create new insights from data. The Senior Data Engineer will be responsible for the design, implementation and delivery of platform enhancements, upstream data lineage, and data acquisition and wrangling. The role will work in a data-rich environment together with data analysts and data scientists.

Main duties of the job

The Senior Data Engineer acts as an expert on all data engineering-related projects and queries, leading on related areas of our projects across the programme. They oversee the data function within NPIC, formulating and implementing long-term plans to develop and improve our service offer. They directly manage the data team, overseeing their development as data engineers, as well as supporting the rest of the team under a matrix model.

The Senior Data Engineer leads on designing and implementing methodologies to extract, transform, clean and wrangle data. They develop testable, reusable code in SQL and Python (desirable) to a high-quality standard. They are a strong communicator, helping clients understand complex processes, IG issues and limitations of the data. They are keen trouble-shooters, diagnosing data quality issues across numerous large and complex healthcare datasets and designing innovative solutions to meet our clients' needs.

About us

Leeds Teaching Hospitals Trust has embarked upon an ambitious Digital programme with a key objective to improve its service to patients. This programme is underpinned by Information Technology and the delivery of a range of digital solutions. The Trust's commitment to this programme includes the implementation of a number of internally developed and externally procured Digital solutions. It is necessary to introduce new, or expand existing, Digital solutions/systems into complex clinical environments which often span multiple departmental and organisational boundaries within the Region.

The National Pathology Imaging Co-operative (NPIC) programme is a £100m+ initiative supported by central funding from NHS England, Innovate UK and the Office for Life Sciences. NPIC has the ambition to provide a national platform to support pathology services.

Leeds Teaching Hospitals is committed to our process of redeploying 'at risk' members of our existing workforce to new roles. As such, all our job adverts are subject to this policy and we reserve the right to close, delay or remove adverts while this process is completed. If you do experience a delay in the shortlisting stage of the recruitment cycle, please bear with us while this process is completed, and contact the named contact if you have any questions.

Job description

Job responsibilities

Data engineering:

Provide a specialist data engineering service to NPIC and its partners.

Produce flexible, programmatic SQL code that can be re-used in multiple settings.

Use techniques to ensure efficient queries on large datasets in environments where you may have limited control, including batch processing, splitting of code into smaller sections, transactions, and indexing.

Extract data from SQL in a variety of ways depending on the environment - command line / Python / PowerShell / SSIS / scripting.

Develop efficient and easily maintainable processes to handle highly complex, large-scale NHS datasets.

Ensure highly complex data process tasks are fully automated and scheduled to run out of hours wherever practically possible.

Identify and advise on inefficient existing queries and propose appropriate changes.

Build test environments based on dataset definitions to allow for local coding & testing on databases.

Work in a multitude of trusted research environments to produce extracts of data.

Rapidly understand and evaluate different databases from their structure, documentation, and contents.

Produce work using source control, peer review and open feedback.

Understand how consistency of capture and recording can affect ability to report & analyse data.

Maintain detailed, specialist knowledge of and expertise in NHS information systems and the National Tariff, and hold an understanding of other information systems.

Review and develop processes and applications as technology and techniques emerge, guaranteeing systems and processes remain current and fit for purpose.

Maintain the security of IT systems and the confidentiality of personal and commercially sensitive data at all times in line with the relevant information governance and technical security standards and policies of each database the NPIC Programme uses.

Understand and work with publicly available health data to provide context to projects.

Introduce, reinforce and monitor modern development practices across the service, including all aspects of continuous integration, and making use of virtual infrastructure as appropriate.

Follow and engage with local developer communities to identify how new technologies are being adopted in both public and private sectors.

Work with team members to collaboratively code to deliver the best possible results.

Plan, develop and evaluate methods and processes for gathering, extracting, transforming and cleaning data and information.

Work with multiple different relational database systems (for example, Oracle, DB2, SQL, Vertica).

Develop comprehensive documentation to accompany all products and analyses.

Own methodologies to extract data for use in population health analytics, activity planning, machine learning and economic evaluations, and regularly review approaches against industry standards.

Oversee the development of data transformation routines from both complex technical specifications and natural language agreements.

Liaise with external system suppliers during the implementation of new solutions and to ensure systems are appropriately maintained.

Maintain control of the technologies used within the data engineering function, and maintain a balance of those with consideration of affordability, suitability, training, and ease of recruitment.

Make necessary technical, governance and political arrangements to allow data extractions from a diverse set of databases.

Maintain confidentiality and discretion at all times.

Leadership of the data engineering function:

Deputise for the Programme Manager and maintain oversight over data engineering aspects within the team.

Be the lead in promoting good coding practices such as running unit, functional, and integration tests to ensure quality.

Be self-sufficient and further the reputation of NPIC as a leader in robust data engineering approaches.

Quality assure queries for the entire NPIC Programme.

Make recommendations, provide advice and prepare strategic reports to senior management and relevant management groups.

Advise on innovative opportunities and support all NPIC teams in their strategies and programmes to maximise the use of data.

Project and client management:

Discover, gather, document and vet business and functional requirements for new projects and new datasets used in those projects.

Ensure stakeholder representation is engaged throughout the project cycle.

Identify products, equipment, services, and facilities for assigned activities, achieving stakeholder buy-in as required. Place orders while remaining mindful of budget limitations.

Service development:

Identify opportunities for the development of services and propose and oversee any such changes, some of which may impact upon other areas or functions.

Drive development and improvement of processes.

Be a leading contributor to a culture where data governance is respected and strictly adhered to within the team.

Line management and training:

Provide effective leadership, training, support, and generate enthusiasm and motivation in all team members to ensure that they are appropriately empowered to carry out the responsibilities of their role.

Line manage a team of data engineers and data analysts, including making recruitment decisions, conducting annual appraisals and one-to-one meetings, setting objectives, identifying training and development requirements, handling disciplinary issues and staff performance, and managing staff absence.

Manage other members of the team through a matrix model.

Work closely with the Lead Analyst and Econometrician in delivering the projects and responsibilities of the Data and Analytics teams of the NPIC.

Provide technical mentoring to other members of the team to maximise their abilities and deliver highly complex data extractions for machine learning, population health analyses and health economic evaluations.

Contribute to training and development across the NPIC Programme.

Instil a culture of learning and knowledge sharing throughout the data and analytics teams to ensure the service is fully resilient.

Proactively identify opportunities to improve system-wide performance using data.

Ensure that the service is perceived externally as both modern and efficient, by providing horizon scanning, demonstrations, and critique of newer techniques.

Carry out in-depth analysis, interpretation and production of multiple complex reports, including financial returns.

Support other project managers as and when required.

Lead the delivery of project plans, allocating tasks as appropriate, identifying risks, issues and dependencies, considering best practice and current options and ultimately making decisions in the best interest of the projects.

Be responsible for a high standard of work supporting the delivery of projects on time, to high quality standards and in a cost-effective manner. Maintain project initiation documents and associated plans with regular team meetings to monitor progress and resources.

Ensure the flexibility of projects if required to meet conflicting/changing requirements.

Plan and organise numerous events/meetings, ensuring communication tools are used to their maximum value for circulating the minutes, agenda and presentations in a timely manner.

Ensure that projects maintain business focus and have clear authority and that the context, including risks, is actively managed in alignment with the strategic priorities of the NPIC Programme.

Financial responsibilities:

Create and monitor project and programme budgets, ensuring that resources are used efficiently and any deviation from the budget is communicated internally and to clients.

Act in a way that is compliant with Standing Orders and Standing Financial Instructions in the discharge of these responsibilities.

Manage selection and spending on third party products and contractors within budget to secure ROI and be able to demonstrate benefits and value for money.

Provide specialised knowledge on resource, skill and capacity requirements for all new work that will expand the scale or complexity of applications, analyses or data sets managed by the service.

Manage capacity and resource in existing infrastructure, and work with data and analytics colleagues to ensure that the service is gaining the maximum benefit from that resource.

Person Specification



  • Significant experience of extracting data, manipulating, understanding, transforming, wrangling and cleaning NHS datasets.
  • Working knowledge and experience with data engineering frameworks and toolkits.
  • Experience working in a big data environment
  • Working knowledge and experience of project management methodologies, including Agile methodologies
  • Expert SQL programmer


  • Good knowledge of Python for data engineering purposes

Skills & Behaviours


  • Experience with ICD, SNOMED, Read codes and the NHS Data Dictionary.
  • Ability to write well-designed, testable, efficient SQL code which follows good coding standards
  • Good time management, working with often tight and conflicting deadlines
  • Strong communication skills to successfully discuss methodologies with all different types of audiences
  • Fluent in SQL-based systems like MySQL, PostgreSQL and Microsoft SQL Server

