Hands-on infrastructure for Linux, cloud, and sovereign AI.
Over fifteen years operating production Linux, AWS, and enterprise networks. The same engineering discipline applied to private AI deployment — open-weight models running on dedicated infrastructure inside your environment, with predictable costs and full data residency. Available for project work and fractional infrastructure leadership.
The simplest system that solves the problem — and nothing more.
The right tool, not the loudest one
Kubernetes is excellent — when the workload genuinely needs it. Most don’t. I match the stack to the problem, the team, and the budget, instead of defaulting to whatever is fashionable. Complexity should be earned.
Cost as a design constraint
Cloud bills, licence sprawl, and underused capacity are usually symptoms of architectural drift. I treat cost reduction as an engineering problem — measure first, redesign second.
Fewer moving parts
Every service you don’t run is a service that can’t fail at 3am. I consolidate, retire, and replace before I add. Your future on-call self will thank you.
You can’t fix what you can’t see
Useful monitoring and logging are non-negotiable. The right metrics and logs turn a guessing game into engineering — surfacing real bottlenecks, catching problems before users do, and showing where the next improvement actually lives.
Operate, don’t just deploy
Go-live is the start, not the finish. I think about backups that have actually been restored, alerts that wake people for real reasons, and runbooks someone other than the original engineer can follow.
Where I work in the stack.
Architecture & Audits
Independent reviews of existing infrastructure, simplification roadmaps, and fractional infrastructure leadership for teams without a senior in-house engineer.
Linux & Systems
The foundation of everything I do. Production Linux at scale, deep performance tuning, systemd and kernel-level troubleshooting, automation, hardened baselines, and migrations away from legacy stacks. Fifteen+ years of running real systems in real conditions.
Cloud & On-Premise
Comfortable across the full spectrum — from racked hardware in your own server room to large-scale public cloud. Deep AWS experience, fluent across the other major hyperscalers, and clear-eyed about when each model actually makes sense for the workload.
Networks
Office, datacentre, and cloud networking. VLAN design, firewall architecture, site-to-site interconnects, VPC topology in AWS, and edge delivery via CDN — from a small office segment to a multi-country backbone.
Endpoint Management
Modern device and identity management for distributed teams. MDM/MAM on Microsoft Intune across the M365 stack, Google Workspace configuration and hardening, conditional access policies, and self-service onboarding for laptops in regions without on-site IT.
IT Security
Infrastructure audits, hardening, secret management, network segmentation, and pragmatic security baselines for SMEs and regulated sectors.
PostgreSQL & Elasticsearch
Production data infrastructure at scale. Schema review, performance tuning, and replication — with particular focus on backup and disaster recovery that is actually recoverable: point-in-time restore, off-site retention, and tested recovery procedures rather than untested assumptions.
Monitoring & Logging
Metrics, logs, and alerts that actually earn their keep — designed to surface real bottlenecks, catch issues early, and show where the next improvement lives.
Self-hosted AI
End-to-end help with running your own LLMs: choosing the right model for the job, deployment and GPU sizing, fine-tuning where it pays off, vectorisation and retrieval pipelines, and prompting workflows that produce reliable output instead of party tricks.
Applications & Prototypes
Hands-on delivery for clients who need a working system, not a slide deck — from rapid prototypes that validate an idea to production-ready applications. System architecture first: clean service boundaries, sensible data flow, and a deployment story that holds up under real load.
Problems solved, generalised.
Internal agentic RAG platform on open-weight LLMs
An enterprise needed a private knowledge assistant for employees without exposing corporate data to external APIs. Designed and built an agentic retrieval-augmented system on Nemotron-class open-weight models, hosted entirely on dedicated company GPUs. Lightweight Node.js chat interface with minimal third-party dependencies, automated ingestion and vectorisation from SharePoint and other corporate sources, and Qdrant for similarity search and document retrieval.
Production LLM inference across a 50+ GPU fleet
Selecting the right open-weight model per use case and tuning runtime configuration to match real workload characteristics — prefill-heavy and decode-heavy paths each need different parameters. Built a custom inference layer on the TensorRT-LLM SDK to gain the flexibility and observability the platform required, now serving production traffic across more than fifty GPUs.
300+ server international Linux fleet under code-managed configuration
Operated a multinational Linux fleet of more than three hundred servers — maintenance, scaling, hardening, and ongoing audit posture. Configuration fully managed in Ansible so every host stays auditable and traceable. Distributed monitoring built on Icinga master plus satellite nodes alongside Prometheus and Grafana, paired with operational workflows for alerts, incidents, and escalation so issues are addressed before users notice.
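As a small illustration of what code-managed configuration looks like in practice, here is a minimal hardening play in the spirit of the setup described above. Host group, file path, and handler names are assumptions for the example, not the actual fleet configuration.

```yaml
# Illustrative sketch only — group, path, and handler names are placeholders.
- name: Baseline hardening for all Linux hosts
  hosts: linux_fleet
  become: true
  tasks:
    - name: Disable SSH password authentication
      ansible.builtin.lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?PasswordAuthentication'
        line: 'PasswordAuthentication no'
      notify: Restart sshd
  handlers:
    - name: Restart sshd
      ansible.builtin.service:
        name: sshd
        state: restarted
```

Because every such change lives in version control, the audit trail comes for free: who changed which baseline, when, and why.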
Disaster recovery for 25 TB+ PostgreSQL and Elasticsearch clusters
Designed and operated disaster-recovery procedures for PostgreSQL clusters carrying more than 25 TB per node, alongside large Elasticsearch clusters. Reviewed schemas and configuration profiles for performance, implemented point-in-time recovery via pgBackRest, and added off-site archival so the recovery story is actually recoverable under real failure conditions.
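To make "actually recoverable" concrete, a pgBackRest setup along these lines pairs a local repository with off-site archival. Stanza name, paths, bucket, and retention values below are illustrative assumptions, not the client's real configuration.

```ini
; Illustrative sketch — stanza, paths, and retention are placeholders.
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
; Second repository for off-site archival (S3-compatible object storage)
repo2-type=s3
repo2-s3-bucket=pg-archive
repo2-s3-endpoint=s3.example.com
repo2-s3-region=eu-central-1

[main]
pg1-path=/var/lib/postgresql/16/main
```

A point-in-time restore then looks like `pgbackrest --stanza=main --delta --type=time --target="2024-05-01 12:00:00+00" restore` — and the procedure only counts as disaster recovery once it has been rehearsed end to end.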
Self-healing video and audio streaming platform at 500 TB / month
Built and operated a backend handling more than 500 TB of streamed media every month. Architecture based on ffmpeg and tsduck with a dedicated recording service plus management and operator interfaces designed for 24/7 use. Every stream is transcribed through self-hosted speech-to-text inference and the resulting transcripts are stored and searchable in Elasticsearch — no audio or text ever leaves the platform. Strong emphasis on automated, self-healing remediation so the platform absorbs routine failures without paging anyone.
50-server datacentre footprint retired through coordinated cloud migration
Coordinated the decommissioning of multiple legacy systems alongside a planned migration to the cloud, retiring more than fifty physical servers and closing rack space across data centres. The result reduced ongoing infrastructure cost while simultaneously raising availability and service quality — the rare migration that improves both columns.
Legacy domain controllers retired for cloud-managed endpoints
An organisation was running identity and device management on aging on-premise domain controllers with VPN-heavy remote access. Migrated to a fully cloud-managed MDM/MAM stack on Microsoft Intune and Microsoft 365, enabling remote work without complex VPN topologies, self-service laptop onboarding for regions with no on-site IT helpdesk, and central endpoint observability for quick audit and compliance response.
Global enterprise network across 10+ countries
Designed and operated office and inter-site networking for a multi-country footprint — Ubiquiti switching and Wi-Fi at the edge, Cisco in the core, pfSense firewalls, and site-to-site interconnects on IPsec and WireGuard. Includes corporate Wi-Fi rollouts and the operational practices needed to keep them stable across time zones.
Zero-trust developer access where VPN no longer fits
Traditional VPN remains the right tool in some scenarios — many others are better served by purpose-built alternatives. Designed bespoke developer access on WireGuard for cases that genuinely needed it, and replaced legacy remote-access VPN with zero-trust networking on OpenZiti where the threat model and operational overhead favoured it.
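For the WireGuard cases, a per-developer peer configuration is deliberately small. The sketch below is illustrative: keys, addresses, endpoint, and the allowed subnet are all placeholders.

```ini
# Illustrative sketch — keys, addresses, and subnets are placeholders.
[Interface]
# Developer's private key (generated with `wg genkey`)
PrivateKey = <developer-private-key>
Address = 10.90.0.12/32

[Peer]
# Access gateway's public key
PublicKey = <gateway-public-key>
Endpoint = vpn.example.com:51820
# Route only the internal developer subnet through the tunnel
AllowedIPs = 10.80.0.0/16
PersistentKeepalive = 25
```

Scoping `AllowedIPs` to exactly the networks a developer needs is most of the access-control story; zero-trust tooling like OpenZiti takes over where per-service identity and policy are required.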
Fifteen+ years of operating systems.
Independent Infrastructure Consultant
Audits, architecture reviews, fractional infrastructure leadership, and sovereign AI deployments for teams in Malta and worldwide.
Director of Infrastructure
Leading the international IT infrastructure team for a four-company group across more than ten countries. AWS and on-premise estate, Microsoft 365 for 600+ users alongside Google Workspace for 300+, MDM/MAM on Intune, plus the security, ISMS, and disaster-recovery programmes that sit on top. Reports to the CTO.
Technical Support Team Leader
Led technical support at a regulated payment service provider — payment-gateway rollouts, SEPA connector integration, and direct REST API integration support.
Technical Engineer — SCADA & IT
Full-stack IT for energy-grid SCADA systems. Functional team lead for SCADA Systemtechnik (Westnetz Süd) from 2017, owning IT security, audits, ISMS, and Linux hardening. Earlier years focused on Python and Bash automation, Ansible-based configuration management, and large-scale Linux server administration.
Education & Foundational Training
BSc Energieinformatik (Computer Science and Energy) at Wilhelm Büchner Hochschule, Darmstadt (2013–2017). Preceded by a three-year apprenticeship as IT systems technician at Westnetz (2009–2012). ITIL 4 Foundation and Certified Scrum Master, both 2020.
Common questions.
Where are you based and where do you work?
Based in Malta. Fully remote across the EU and worldwide. On-site visits in Malta available; for clients elsewhere I travel by arrangement. Most engagements run mainly remote with occasional travel for kick-off and key milestones.
What does “sovereign AI” mean in practice?
Open-weight large language models (Llama, Nemotron, Qwen, Mistral) running on dedicated GPUs inside the client’s environment — typically vLLM or TensorRT-LLM serving, an embedding model, a vector database such as Qdrant or Elasticsearch, and ingestion pipelines from corporate sources. No prompts, documents, or embeddings ever leave the network. Predictable cost, full data residency, full audit trail.
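The retrieval step at the heart of such a pipeline is conceptually simple. The dependency-free sketch below shows its shape; in production the vectors come from an embedding model with hundreds of dimensions and the similarity search runs inside Qdrant or Elasticsearch, not in Python lists — the toy documents and vectors here are invented for illustration.

```python
# Illustrative sketch of RAG retrieval: rank documents by cosine
# similarity to the query embedding, keep the top k. Toy data only.
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, corpus, top_k=2):
    """Return the texts of the top_k most similar documents."""
    ranked = sorted(corpus, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["text"] for d in ranked[:top_k]]

# Toy "embedded" corpus; real embeddings have hundreds of dimensions.
corpus = [
    {"text": "VPN setup guide",      "vec": [0.9, 0.1, 0.0]},
    {"text": "Expense policy",       "vec": [0.0, 0.2, 0.9]},
    {"text": "WireGuard quickstart", "vec": [0.8, 0.3, 0.1]},
]

print(retrieve([1.0, 0.2, 0.0], corpus))
# → ['VPN setup guide', 'WireGuard quickstart']
```

The retrieved passages are then injected into the model's context, so the only thing that ever touches the LLM is data already inside the network.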
Do you only work with large enterprises?
No. SMEs and regulated mid-market are the sweet spot. A common engagement is fractional infrastructure leadership for teams that don’t yet have a senior infrastructure engineer in-house. Audits and architecture reviews work well as a starting point regardless of company size.
How do you charge?
Day rate or fixed-scope project for one-off work, monthly retainer for fractional or advisory engagements. Billing happens via a Maltese VAT-registered entity. Day rates are quoted on request after a short scoping call so the number reflects the work.
What size GPU fleets have you operated?
Production inference across 50+ GPUs. Comfortable advising from single-node deployments (1× H100 / L40S) up to multi-node clusters with TensorRT-LLM or vLLM. The harder problem is usually choosing the right model and tuning the runtime per workload, not raw cluster size.
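A feel for the sizing arithmetic helps here. The back-of-envelope calculation below estimates KV-cache memory, one of the dominant terms in GPU planning; the model dimensions are assumptions for a Llama-3-70B-like architecture (80 layers, 8 KV heads, head dimension 128), not vendor-published serving figures.

```python
# Rough KV-cache sizing for LLM serving. Dimensions are illustrative
# assumptions for a 70B-class model, not measured figures.
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Bytes of KV cache: 2 tensors (K and V) per layer, fp16 by default."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

gib = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128,
                     seq_len=8192, batch=16) / 2**30
print(f"{gib:.1f} GiB of KV cache")
# → 40.0 GiB of KV cache
```

Forty gibibytes of cache for sixteen concurrent 8K-token sequences is why context length and batch size, not parameter count alone, drive how many GPUs a workload really needs.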
Which cloud providers do you work with?
Deep AWS experience, fluent across Azure, GCP, Hetzner, OVH, and on-prem. Cloud-agnostic by preference — the right answer depends on the workload, the data, and the team. I help teams evaluate honestly rather than defaulting to whichever provider is fashionable.
Do you take on short engagements?
Yes. Week-long architecture audits and one-off infrastructure reviews are common starting points and often the most useful first step. They give an independent picture of where the system stands and what to do next, with no obligation to continue.
Which languages do you work in?
English is my working language; German is my native language. Documentation and reports are normally produced in English; client conversations happen in either.
Have an infrastructure problem worth simplifying?
Drop me a short email about what you’re working on. If it sounds like a fit, we’ll set up a call from there to dig into the details.
karsten@tidewind.io