Thesis work: 30 hp - Industrial Knowledge Graphs for OT Defence: From Network Flows to Auto-remediation
Smart Factory Lab
Background
We are formalising IEC 62443 zones and conduits and need faster, fact-based ways to check segmentation, spot policy drift, and assess the blast radius when something breaks or a CVE drops. Today this is manual and fragmented across IPAM, configs, flow logs, firmware lists, and work orders. A knowledge graph (KG) can join assets, zones, conduits, observed flows, firmware/SBOMs, vulnerabilities, owners, and work orders into one model. What industry lacks is a thorough, measured comparison between a KG-based approach and current practice on a real assembly area.
Read more about IEC 62443: https://gca.isa.org/blog/how-to-define-zones-and-conduits
Assignment
Begin by surveying and securing read-only access to the relevant data sources to assemble a representative data set. Model the core entities and relationships, then build a robust, repeatable pipeline to ingest and validate the data. Develop a clean interface to explore the graph and findings, with the necessary integrations (e.g., identity/ownership and CMMS for ticketing). Finally, run a side-by-side comparison against the current workflow, measuring time-to-answer, accuracy, and operational impact.
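To make the modelling step concrete, here is a minimal sketch of how observed flows could be checked against declared zones and conduits. It is only an illustration under stated assumptions: the class names, the asset and zone names, and the three-way outcome (ok / violation / unknown) are hypothetical and not part of the lab's data model, and a real implementation would more likely use a graph store with SHACL shapes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conduit:
    """An allowed communication path between two zones (hypothetical model)."""
    src_zone: str
    dst_zone: str
    port: int


@dataclass(frozen=True)
class Flow:
    """An observed network flow between two assets (hypothetical model)."""
    src_asset: str
    dst_asset: str
    port: int


def check_flows(flows, asset_zone, conduits):
    """Classify each flow as 'ok', 'violation' or 'unknown'.

    'unknown' marks flows where an asset has no zone assigned, so that
    data-quality gaps are reported instead of silently passing.
    """
    allowed = {(c.src_zone, c.dst_zone, c.port) for c in conduits}
    results = {}
    for flow in flows:
        src = asset_zone.get(flow.src_asset)
        dst = asset_zone.get(flow.dst_asset)
        if src is None or dst is None:
            results[flow] = "unknown"
        elif src == dst or (src, dst, flow.port) in allowed:
            # Traffic inside one zone, or over a declared conduit, is allowed.
            results[flow] = "ok"
        else:
            results[flow] = "violation"
    return results


# Made-up example data for illustration only.
asset_zone = {"plc-1": "cell-a", "hmi-1": "cell-a", "historian": "dmz"}
conduits = [Conduit("cell-a", "dmz", 4840)]
flows = [
    Flow("plc-1", "historian", 4840),  # declared conduit
    Flow("hmi-1", "historian", 445),   # crosses zones without a conduit
    Flow("plc-1", "hmi-1", 502),       # same zone
    Flow("plc-1", "laptop-7", 22),     # laptop-7 has no zone assigned
]
results = check_flows(flows, asset_zone, conduits)
for flow, verdict in results.items():
    print(flow.src_asset, "->", flow.dst_asset, flow.port, verdict)
```

Keeping "unknown" separate from "violation" is a deliberate choice in this sketch: it lets missing zone assignments be counted as a data-quality finding rather than as a policy breach.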
You’ll have a supervisor, regular check-ins, and access to mentors and our lab resources. The scope is open to your ideas.
Key questions:
- Does a KG with zones/conduits + SHACL detect segmentation/policy violations with higher precision/recall and lower time-to-answer than current practice on the pilot area?
- Does KG-based blast-radius and single-point-of-failure analysis reduce the effort and time to produce an accurate impact list for an asset failure or isolation?
- When SBOM/CVE data are linked to the plant KG, is the time from CVE publication to an owner-tagged, prioritised remediation list significantly shorter than today?
- What is the smallest useful ontology/rule set that achieves ≥80% precision on violation detection, and how do data-quality gaps (missing zone/owner/firmware) affect results?
- Does closing the loop (automatic CMMS tickets from KG findings) shorten remediation lead time and improve ticket first-time-right rates without increasing noise?
- How robust are the results to partial topology/flow data, and what confidence annotations are needed for safe decision-making?
(SBOM: Software Bill of Materials; CVE: Common Vulnerabilities and Exposures; CMMS: Computerised Maintenance Management System; SHACL: Shapes Constraint Language)
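As a rough illustration of two quantities the key questions refer to, the sketch below computes a blast radius as the set of assets that transitively depend on a failed asset, and precision/recall for a set of reported violations against a reviewed ground truth. The dependency map, asset names, and finding IDs are made up for the example; how dependencies are actually derived from the plant data is left open by the project.

```python
from collections import deque


def blast_radius(depends_on, failed):
    """Return every asset that directly or transitively depends on `failed`.

    `depends_on` maps an asset to the set of assets it needs in order to work.
    """
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents = {}
    for asset, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(asset)

    impacted = set()
    queue = deque([failed])
    while queue:  # breadth-first traversal; the seen-check handles cycles
        node = queue.popleft()
        for nxt in dependents.get(node, ()):
            if nxt != failed and nxt not in impacted:
                impacted.add(nxt)
                queue.append(nxt)
    return impacted


def precision_recall(reported, actual):
    """Precision and recall of reported findings against ground truth."""
    reported, actual = set(reported), set(actual)
    true_positives = len(reported & actual)
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Made-up example data for illustration only.
depends_on = {
    "hmi-1": {"switch-a"},
    "plc-1": {"switch-a"},
    "robot-1": {"plc-1"},
    "historian": {"switch-b"},
}
impacted = blast_radius(depends_on, "switch-a")
print(sorted(impacted))

precision, recall = precision_recall(
    reported={"f1", "f2", "f3", "f4"},  # findings raised by the rules
    actual={"f1", "f2", "f3", "f5"},    # findings confirmed by manual review
)
print(precision, recall)
```

In this toy example, three of the four reported findings are confirmed and three of the four true violations are found, so both precision and recall are 0.75, just below the ≥80% precision target named above.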
Education and time plan
Education: Master's program in Computer Science or Data Science. If you don’t meet every point, we still encourage you to apply.
Number of students: 2
Start date: January 2026
Estimated time needed: 20 weeks
Topics: Software Engineering & Architecture, Knowledge Graphs, Data Pipelines
Contact persons and supervisors:
Maarten van Ittersum will be the supervisor and can answer questions about the project.
Tel. +46706160581, email: maarten.van.ittersum@scania.com
Application
Your application should include a CV, cover letter and transcripts.
A background check may be conducted for this position. We conduct interviews on an ongoing basis and may close recruitment earlier than the stated date.
Södertälje, SE, 151 38