Thesis work (30 hp): Industrial Knowledge Graphs for OT Defence: From Network Flows to Auto-remediation

Smart Factory Lab

Background

We are formalising IEC 62443 zones and conduits and need faster, fact-based ways to check segmentation, spot policy drift, and assess the blast radius when something breaks or a CVE drops. Today this is manual and fragmented across IPAM, configs, flow logs, firmware lists, and work orders. A knowledge graph (KG) can join assets, zones, conduits, observed flows, firmware/SBOMs, vulnerabilities, owners, and work orders into one model. What industry lacks is a thorough, measured comparison between a KG-based approach and current practice on a real assembly area.

Read more about IEC 62443: https://gca.isa.org/blog/how-to-define-zones-and-conduits
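
As a rough illustration of the idea described above, the sketch below joins zone membership, allowed conduits, and observed flows in plain Python and flags flows that cross zones without a permitting conduit. All asset names, zone names, and port rules are hypothetical illustration data, and the thesis itself may well use a proper graph store with SHACL shapes instead of dictionaries.

```python
from dataclasses import dataclass

# All names and values below are hypothetical illustration data, not plant data.


@dataclass(frozen=True)
class Flow:
    src: str
    dst: str
    port: int


# Asset -> zone membership (assumed to come from IPAM/config sources).
ZONE_OF = {
    "plc-01": "cell-a",
    "hmi-01": "cell-a",
    "historian-01": "site-ops",
    "laptop-17": "office",
}

# Allowed conduits: (source zone, destination zone) -> permitted destination ports.
CONDUITS = {
    ("cell-a", "site-ops"): {4840},
}


def violations(flows):
    """Return (flow, reason) pairs for flows that break the zone/conduit model."""
    found = []
    for f in flows:
        src_zone = ZONE_OF.get(f.src)
        dst_zone = ZONE_OF.get(f.dst)
        if src_zone is None or dst_zone is None:
            # Data-quality gap: at least one endpoint has no zone assigned.
            found.append((f, "unknown zone"))
        elif src_zone != dst_zone and f.port not in CONDUITS.get((src_zone, dst_zone), set()):
            # Cross-zone traffic with no conduit permitting this port.
            found.append((f, "no conduit"))
    return found


observed = [
    Flow("hmi-01", "plc-01", 502),         # same zone, allowed
    Flow("plc-01", "historian-01", 4840),  # cross-zone, covered by a conduit
    Flow("laptop-17", "plc-01", 502),      # cross-zone, no conduit defined
    Flow("plc-01", "camera-09", 554),      # destination has no zone assigned
]

result = violations(observed)
for flow, reason in result:
    print(f"{flow.src} -> {flow.dst}:{flow.port} ({reason})")
```

Reporting "unknown zone" separately from "no conduit" keeps data-quality gaps visible instead of silently counting them as policy violations.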

Assignment

Begin by surveying and securing read-only access to the relevant data sources to assemble a representative data set. Model the core entities and relationships, then build a robust, repeatable pipeline to ingest and validate the data. Develop a clean interface to explore the graph and findings, with the necessary integrations (e.g., identity/ownership and CMMS for ticketing). Finally, run a side-by-side comparison against the current workflow, measuring time-to-answer, accuracy, and operational impact.

You’ll have a supervisor, regular check-ins, and access to mentors and our lab resources. The scope is open to your ideas.

Key questions:

Does a KG with zones/conduits + SHACL detect segmentation/policy violations with higher precision/recall and lower time-to-answer than current practice on the pilot area?
Does KG-based blast-radius and single-point-of-failure analysis reduce the effort and time to produce an accurate impact list for an asset failure or isolation?
When SBOM/CVE data are linked to the plant KG, is the time from CVE publication to an owner-tagged, prioritised remediation list significantly shorter than today?
What is the smallest useful ontology/rule set that achieves ≥80% precision on violation detection, and how do data-quality gaps (missing zone/owner/firmware) affect results?
Does closing the loop (automatic CMMS tickets from KG findings) shorten remediation lead time and improve ticket first-time-right rates without increasing noise?
How robust are results to partial topology/flow data, and what confidence annotations are needed for safe decision-making?

(SBOM: Software Bill of Materials; CVE: Common Vulnerabilities and Exposures; CMMS: Computerised Maintenance Management System; SHACL: Shapes Constraint Language)
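
To make two of the questions above more concrete, here is a minimal sketch of how a blast-radius query and a precision/recall comparison could look. It assumes a simple "depends on" edge list and a manually produced ground-truth impact list. The asset names, the dependency edges, and the ground truth are all invented for illustration, and a real evaluation would use the pilot area's data.

```python
from collections import deque

# Hypothetical dependency edges: asset -> assets that directly depend on it.
DEPENDENTS = {
    "switch-03": ["plc-01", "hmi-01"],
    "plc-01": ["robot-a", "conveyor-2"],
    "hmi-01": [],
    "robot-a": [],
    "conveyor-2": ["packing-station"],
}


def blast_radius(asset):
    """Return every asset transitively dependent on `asset` (breadth-first search)."""
    seen = {asset}
    queue = deque([asset])
    while queue:
        node = queue.popleft()
        for dep in DEPENDENTS.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen - {asset}


def precision_recall(predicted, actual):
    """Compare a predicted impact set with a ground-truth set."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


impacted = blast_radius("plc-01")

# Hypothetical ground truth, e.g. an impact list produced by the current manual workflow.
ground_truth = {"robot-a", "conveyor-2", "packing-station", "hmi-01"}

p, r = precision_recall(impacted, ground_truth)
print(sorted(impacted), p, r)
```

In this toy data the graph misses `hmi-01`, so precision is 1.0 while recall is 0.75. A gap of that kind is what the side-by-side comparison would need to explain, whether it stems from a missing edge in the graph or an error in the manual list.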

Education and time plan

Education: Master's programme in Computer Science or Data Science. If you don’t meet every point, we still encourage you to apply.

Number of students: 2

Start date: January 2026

Estimated time needed: 20 weeks

Topics: Software Engineering & Architecture, Knowledge Graphs, Data Pipelines

Contact persons and supervisors:
Maarten van Ittersum will be the supervisor and can answer questions about the project.

Tel. +46706160581, email: maarten.van.ittersum@scania.com

Application

Your application should include a CV, cover letter and transcripts.

A background check may be conducted for this position. We conduct interviews on an ongoing basis and may close recruitment earlier than the stated date.

Requisition ID: 21547

Number of Openings: 1.0

Part-time / Full-time: Full-time

Permanent / Temporary: Temporary

Country/Region: SE

Location(s):

Södertälje, SE, 151 38

Required Travel: 0%

Workplace: Hybrid