AI Infrastructure & Cloud Migration · NDA · 2025

Swiss Healthcare AI Infrastructure Migration - Cloud to On-Premise

Four-stage AI infrastructure transformation for a Swiss private hospital, migrating medical diagnostics AI from AWS to on-premise infrastructure. The system processes 200,000 scans monthly, and the move cut ongoing operating costs by 95% while strengthening data security.

95% Cost Reduction

200K Monthly Scans

2-4s Processing Time

$5.4M 3-Year Savings

Deep Learning · Computer Vision · Medical AI · CNN · NVIDIA GPUs · PACS Integration · Infrastructure Design · Cloud Migration

Inside the engagement

This case study documents our completed AI infrastructure migration with a private hospital and diagnostic center in Switzerland. Over four stages spanning two years, we developed and deployed medical diagnostics AI on cloud infrastructure, scaled to production volumes, then migrated the entire system to on-premise infrastructure when economics and data residency requirements made cloud hosting inefficient. The system now processes 200,000 medical scans monthly, delivering faster results at a fraction of previous operating costs while strengthening data security and regulatory compliance.

The client operates a major private hospital and diagnostic imaging center in Switzerland, serving patients across multiple cantons. The facility conducts extensive diagnostic imaging including CT scans, MRI, X-ray, and ultrasound studies, generating large volumes of medical imaging data requiring radiologist review. With a shortage of specialized radiologists and increasing patient volumes, the facility needed AI assistance to prioritize urgent cases, flag potential abnormalities, and route studies to appropriate specialists efficiently.

The Swiss healthcare system requires strict data handling practices, and the facility's patient base includes individuals who chose private healthcare partly for its enhanced privacy protections. Any AI solution needed to meet HIPAA-equivalent standards while handling sensitive medical data that patients expected to remain within Switzerland.

Implementation Overview

The project evolved through four distinct stages over two years, with each stage building on previous work while adapting to changing requirements and operational realities. What began as a cloud-based AI deployment ultimately became an on-premise solution when production economics and data residency priorities shifted.

Stage	Focus Area	Status	Key Capabilities
1	Model Development & Cloud Deployment	Completed	Multi-modality image analysis, abnormality detection, priority scoring, PACS integration, elastic cloud infrastructure
2	Production Scale & Analysis	Completed	200K monthly scans, cost analysis, optimization evaluation, TCO modeling, data residency assessment
3	Infrastructure Design	Completed	Hardware specification, redundancy planning, network optimization, disaster recovery design, team capability validation
4	Migration & Optimization	Completed	Parallel operation, seamless cutover, performance tuning, compliance documentation, monitoring implementation

Stage One: AI Model Development & Cloud Deployment

We started by developing deep learning models for medical image analysis, initially deploying on AWS infrastructure. Cloud hosting made sense at this stage because we needed computational flexibility during model training and didn't know what production usage patterns would look like.

We built convolutional neural network models trained on large datasets of annotated medical images. The models learned to identify anatomical structures, detect abnormalities, and flag findings that require urgent radiologist attention. For chest X-rays, the system recognizes pneumonia, lung masses, fractures, and other pathologies. For CT scans, it identifies tumors, hemorrhages, and structural abnormalities. MRI analysis focuses on soft tissue contrast and identifying lesions, inflammation, or degenerative changes.

The models output probability scores for various findings rather than definitive diagnoses. A chest X-ray might return 85% probability of consolidation suggesting pneumonia, 12% probability of a nodule, and 3% probability of pneumothorax. Radiologists use these assessments to prioritize their review queue and focus attention on studies most likely to contain significant findings.
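The way probability scores can feed a review queue is sketched below. This is a minimal illustration, not the deployed scoring logic: the `URGENCY_WEIGHTS` values, the `Study` type, and the rule of taking the highest urgency-weighted probability are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical urgency weights per finding. The real clinical weighting is not
# described in this case study and would be agreed with radiologists.
URGENCY_WEIGHTS = {
    "pneumothorax": 1.0,
    "consolidation": 0.6,
    "nodule": 0.4,
}


@dataclass
class Study:
    study_id: str
    probabilities: dict[str, float]  # finding name -> model probability


def priority_score(study: Study) -> float:
    """Return the highest urgency-weighted probability across all findings."""
    return max(
        (URGENCY_WEIGHTS.get(name, 0.0) * p for name, p in study.probabilities.items()),
        default=0.0,
    )


def build_worklist(studies: list[Study]) -> list[Study]:
    """Order studies so the highest-priority ones are reviewed first."""
    return sorted(studies, key=priority_score, reverse=True)


studies = [
    Study("A", {"consolidation": 0.85, "nodule": 0.12, "pneumothorax": 0.03}),
    Study("B", {"consolidation": 0.05, "nodule": 0.10, "pneumothorax": 0.70}),
    Study("C", {"consolidation": 0.02, "nodule": 0.03, "pneumothorax": 0.01}),
]
worklist = build_worklist(studies)
print([s.study_id for s in worklist])
```

Study A uses the chest X-ray probabilities quoted above. With these illustrative weights, a likely pneumothorax (study B) outranks a likely pneumonia (study A), which shows why the weighting itself is a clinical decision rather than a modeling one.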

We trained separate model architectures optimized for different imaging modalities because CT, MRI, and X-ray images have fundamentally different characteristics. The models learned not just to identify pathology but to account for normal anatomical variation, imaging artifacts, and previous surgical changes that could be mistaken for abnormalities.

The cloud deployment used GPU instances for inference, with auto-scaling to handle variable workloads throughout the day. Medical imaging volumes spike in morning hours when outpatient clinics schedule procedures, then taper in evenings. The elasticity of cloud infrastructure handled these patterns efficiently during the testing phase.

We integrated the AI system with the facility's PACS (Picture Archiving and Communication System) and radiology information system. When a scan completes, it automatically transfers to the AI pipeline for analysis. Results populate the radiologist's worklist with priority flags and preliminary findings to guide their review.

Stage Two: Production Scale and Cost Reality

After 18 months of operation, the system proved its clinical value and became essential to the facility's workflow. Radiologists relied on the AI prioritization to manage their workloads effectively. The system was processing 200,000 scans monthly with consistent, predictable volume.

That predictability revealed a problem. Cloud costs were $47,000 per month just for compute instances running the inference models. Total AWS spending including storage, data transfer, backups, and ancillary services reached $167,000 monthly, just over $2 million annually.

We ran a detailed financial analysis projecting three-year costs. Continuing on AWS would cost approximately $6 million. The compute costs were the largest component, but data egress charges for transferring large imaging files, storage costs for retaining data during processing, and backup services all contributed significantly.

We evaluated cloud optimization strategies including reserved capacity purchases, spot instances, and architectural changes to reduce costs. The facility already had reserved capacity commitments. We modeled aggressive rightsizing and optimization, but couldn't close the cost gap sufficiently. At 200,000 scans monthly with multi-gigabyte imaging files, the math didn't work.

The alternative was bringing the infrastructure in-house. We calculated capital expenditure for servers, GPUs, storage, networking equipment, and installation at approximately $340,000. Annual operating costs including power, cooling, maintenance contracts, and monitoring would run $95,000. Three-year total cost of ownership came to roughly $625,000, saving about $5.4 million compared to continued cloud hosting.
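The three-year comparison can be reproduced from the figures quoted above. The sketch assumes flat cloud pricing, no discounting, and no hardware refresh within the period; those simplifications are ours, and the detailed financial model may have treated them differently.

```python
MONTHS = 36  # three-year horizon used in the analysis

# Figures quoted in the case study (USD).
cloud_monthly_total = 167_000
onprem_capex = 340_000
onprem_annual_opex = 95_000

# Cloud: the monthly bill repeated over the horizon.
cloud_3yr = cloud_monthly_total * MONTHS

# On-premise: one-time capital expenditure plus three years of operating costs.
onprem_3yr = onprem_capex + onprem_annual_opex * (MONTHS // 12)

savings_3yr = cloud_3yr - onprem_3yr

print(f"Cloud, 3 years:   ${cloud_3yr:,}")
print(f"On-prem, 3 years: ${onprem_3yr:,}")
print(f"Savings:          ${savings_3yr:,}")
```

Under these assumptions the cloud path comes to $6,012,000, the on-premise path to $625,000, and the difference to $5,387,000, which rounds to the $5.4 million quoted.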

The facility owned their building and already maintained infrastructure teams managing their existing medical IT systems. They weren't building data center capability from scratch. The marginal cost of adding AI inference servers to their existing operations was reasonable.

Beyond cost, data residency became a stronger consideration. The facility's patients chose private healthcare partly for privacy. Having their medical images uploaded to AWS data centers, even with encryption and compliance certifications, created discomfort among both patients and hospital administration. Keeping data physically on-premise meant scans never left the building except for explicitly authorized transfers to other medical facilities.

Stage Three: Infrastructure Planning and Migration Preparation

We designed the on-premise infrastructure to match cloud reliability while eliminating ongoing compute costs. The facility already had robust power systems including independent backup generators providing full redundancy. Their network included three different ISP connections with automatic failover. Physical security was handled by existing medical center protocols that already protected sensitive patient areas and medical equipment.

We specified server hardware with sufficient GPU capacity to handle current workloads with headroom for growth. The inference models run on NVIDIA GPUs selected for optimal performance on the specific neural network architectures we deployed. Storage systems provide fast access to imaging data during processing with sufficient capacity for the working dataset that needs immediate availability.

Redundancy happens at the hardware level with failover systems ensuring continuous availability. The facility maintains offsite backups for disaster recovery, stored at a separate medical facility they operate in another canton. This provides geographic separation without requiring cloud storage services.

We validated that the on-premise infrastructure could match or exceed cloud performance. Network latency actually improved because imaging equipment, processing servers, and radiologist workstations all reside within the same facility. A scan taken in the radiology department travels over the local network to servers in the facility's equipment room, is processed there, and returns results to the radiologist's workstation without ever leaving the building. Response times dropped from 8-12 seconds for cloud processing to under 5 seconds for local processing.

The facility's existing infrastructure team could manage the additional servers without hiring new staff. These were experienced IT professionals already maintaining the facility's extensive medical IT systems including PACS, electronic health records, and medical equipment networks. Adding AI inference servers to their responsibilities was straightforward given their existing capabilities.

Stage Four: Migration Execution and Optimization

We executed the migration during a planned maintenance window, transitioning from AWS to on-premise infrastructure with minimal disruption. The approach involved running both systems in parallel temporarily, gradually shifting workload to the local servers while verifying performance and reliability.
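One common way to shift workload gradually between two parallel systems is deterministic percentage routing. The sketch below is an assumption about how such a split could be done, not a description of the mechanism used in this engagement; `route_study`, the `local_percent` parameter, and the sample UIDs are hypothetical.

```python
import hashlib


def route_study(study_uid: str, local_percent: int) -> str:
    """Return "local" or "cloud" for a study; the answer is stable per UID."""
    if not 0 <= local_percent <= 100:
        raise ValueError("local_percent must be between 0 and 100")
    # Hash the UID into one of 100 buckets so the same study always lands
    # on the same side for a given percentage.
    digest = hashlib.sha256(study_uid.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # 0..99
    return "local" if bucket < local_percent else "cloud"


# Made-up UIDs for illustration only.
uids = [f"1.2.840.99999.{i}" for i in range(10_000)]
for percent in (0, 10, 50, 100):
    share = sum(route_study(u, percent) == "local" for u in uids) / len(uids)
    print(f"{percent:>3}% target -> {share:.1%} routed locally")
```

Because the bucket depends only on the UID, raising the percentage only ever moves studies from cloud to local, never back, which keeps comparisons between the two systems clean during the parallel period.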

We migrated the trained models from cloud storage to local servers, along with all configuration, processing pipelines, and integration code. The PACS integration was adapted to send images to the local inference servers instead of uploading them to AWS. We validated that all imaging modalities processed correctly and that results returned to radiologists' workstations as expected.

The cutover happened over a weekend when imaging volumes were lowest. We monitored the system closely during the first week, ready to fall back to cloud processing if problems emerged. None did. The local infrastructure handled the full production workload reliably.

Post-migration optimization included tuning GPU utilization patterns, adjusting batch processing schedules, and refining storage management. We implemented monitoring that alerts infrastructure teams if processing times increase, GPU utilization drops unexpectedly, or any component shows signs of degradation that might require maintenance.
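The alerting conditions described above can be expressed as a small threshold check. This is a sketch under stated assumptions: the `Thresholds` values are hypothetical, the choice of a 95th-percentile latency and a mean GPU utilization is ours, and the real monitoring stack is not described in this case study.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical limits, not taken from the deployment.
    max_p95_seconds: float = 5.0
    min_gpu_utilization: float = 0.20


def check_health(
    latencies_s: list[float],
    gpu_utilization: list[float],
    limits: Thresholds = Thresholds(),
) -> list[str]:
    """Return alert messages for one monitoring window (empty if healthy)."""
    alerts = []
    if len(latencies_s) >= 2:
        # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
        p95 = quantiles(latencies_s, n=20)[-1]
        if p95 > limits.max_p95_seconds:
            alerts.append(f"p95 processing time {p95:.1f}s exceeds {limits.max_p95_seconds}s")
    if gpu_utilization:
        mean_util = sum(gpu_utilization) / len(gpu_utilization)
        if mean_util < limits.min_gpu_utilization:
            alerts.append(f"mean GPU utilization {mean_util:.0%} below {limits.min_gpu_utilization:.0%}")
    return alerts


healthy = check_health([2.0, 2.5, 3.0, 3.5, 4.0] * 4, [0.7, 0.8])
degraded = check_health([9.0] * 20, [0.05, 0.10])
print(healthy)
print(degraded)
```

A low-utilization alert is included because a sudden drop can indicate that scans have stopped arriving from PACS, which a latency check alone would not catch.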

The compliance team documented the new architecture for regulatory audits. Because patient data is no longer transmitted outside the facility, several compliance controls became simpler. The facility eliminated third-party data processing agreements with AWS. Security audits now examine physical security and local network controls rather than cloud configuration and data transmission protocols.

We updated the disaster recovery procedures to account for the on-premise architecture. The offsite backup system at the sister facility provides recovery capability if the primary site experiences serious problems. While not as geographically distributed as multi-region cloud deployment, the approach matches the facility's risk tolerance and regulatory requirements.

Results

The completed migration delivered the projected cost savings while improving system performance. The facility now spends approximately $95,000 annually on operations that previously cost about $2 million, a reduction of over 95% in ongoing costs. The initial $340,000 infrastructure investment was recovered through seven months of savings.
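The ongoing-cost reduction follows directly from the quoted figures. The payback period depends on which savings baseline is used, and the text does not state it, so the sketch below computes it under two assumed baselines: total AWS spend net of on-premise operating costs, and the $47,000 monthly compute line alone.

```python
# Figures quoted in the case study (USD).
cloud_monthly_total = 167_000
cloud_monthly_compute = 47_000
onprem_capex = 340_000
onprem_annual_opex = 95_000

# Reduction in ongoing (annual) costs.
cloud_annual = cloud_monthly_total * 12
ongoing_reduction = 1 - onprem_annual_opex / cloud_annual

# Payback under two assumed baselines; the source does not say which applies.
onprem_monthly_opex = onprem_annual_opex / 12
payback_vs_total = onprem_capex / (cloud_monthly_total - onprem_monthly_opex)
payback_vs_compute = onprem_capex / cloud_monthly_compute

print(f"Ongoing cost reduction: {ongoing_reduction:.1%}")
print(f"Payback vs. total spend, net of on-prem opex: {payback_vs_total:.1f} months")
print(f"Payback vs. compute line only: {payback_vs_compute:.1f} months")
```

The compute-only baseline gives a payback of roughly seven months, while the total-spend baseline gives a shorter one, so the quoted figure reads as the more conservative of the two.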

Processing latency improved significantly. Scans are now analyzed in 2-4 seconds, compared to 8-12 seconds when round-tripping to AWS. Radiologists receive preliminary findings faster, improving their workflow efficiency. The size of the latency improvement was unexpected: we anticipated matching cloud performance, not exceeding it. Eliminating internet upload and download time and processing within the local facility network produced faster responses than cloud infrastructure could deliver.

Data residency concerns were resolved completely. Patient scans remain physically within the facility's building throughout the analysis process. The compliance team simplified their audit procedures because they no longer assess third-party data processing controls. Security reviews focus on physical facility security and local network protections that were already robust for protecting other medical systems.

The facility's security team values no longer being exposed to the risk of data breaches at cloud providers. While AWS maintains strong security, removing that dependency eliminates an entire category of third-party risk. For medical imaging data that the facility retains for 30+ years, eliminating external transmission also removes concerns that future cryptographic vulnerabilities might compromise currently encrypted data decades from now.

System reliability has matched cloud availability. The redundant power systems, multiple ISP connections, and hardware redundancy provide the uptime radiologists require. The infrastructure team handles routine maintenance during off-hours without affecting clinical operations.

The facility discovered an unexpected benefit around data handling during patient transfers. When patients move to other medical facilities for specialized care, the imaging data and AI analysis results can be provided on physical media or through direct facility-to-facility secure transfer. This gives the facility more control over how data leaves their systems compared to managing permissions in cloud storage.

Financial predictability improved substantially. The facility now has fixed infrastructure costs rather than usage-based cloud billing that scaled with patient volumes. This makes budgeting simpler and eliminates concerns about unexpected cost increases if imaging volumes grow.

The infrastructure investment included capacity for growth beyond current volumes. The facility can scale to approximately 300,000 scans monthly before needing additional hardware. If they reach that point, adding another server costs far less than the equivalent cloud capacity increase would.
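The headroom figure can be turned into a rough planning horizon. In the sketch below, the two volume figures come from the text, while the monthly growth rates passed to `months_until_capacity` are purely hypothetical inputs chosen for illustration; the facility's actual growth rate is not stated.

```python
import math

# Volumes quoted in the case study (scans per month).
current_scans = 200_000
capacity_scans = 300_000

# Fractional headroom above the current volume.
headroom = capacity_scans / current_scans - 1


def months_until_capacity(monthly_growth: float) -> float:
    """Months of compound growth until volume reaches installed capacity."""
    if monthly_growth <= 0:
        return math.inf  # flat or shrinking volume never reaches capacity
    return math.log(capacity_scans / current_scans) / math.log(1 + monthly_growth)


print(f"Headroom: {headroom:.0%}")
for growth in (0.01, 0.02):  # hypothetical growth rates
    print(f"{growth:.0%} monthly growth -> {months_until_capacity(growth):.0f} months")
```

With 50% headroom, even sustained compound growth of 1 to 2% per month leaves well over a year before additional hardware would be needed.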

Radiologists report no degradation in AI system performance or availability. From their perspective, the system works exactly as before: scans are analyzed automatically and results appear in their worklist with priority flags and preliminary findings. The infrastructure change was transparent to clinical users.

The facility's administration views the project as a successful example of re-evaluating technology decisions as circumstances evolve. Their initial cloud deployment made perfect sense for getting the AI system operational quickly. Two years later at production scale with different priorities, bringing the infrastructure in-house became the logical choice.

The migration demonstrated that 'cloud-first' doesn't mean 'cloud-forever' for every workload. Infrastructure decisions should adapt to changing business realities, workload characteristics, and cost structures. For this facility with predictable high-volume processing of sensitive data and existing infrastructure capabilities, owning the hardware delivered better economics and stronger data protection than continued cloud hosting.

What every case here has in common

Every project on this page shipped because we said no to the wrong scope before we said yes to the right one. Half the value of working with us is the engagement we will not take. The other half is the system that ends up running in your business.

Sebastian Mondragon, Founder, Particula Tech

Before you ask

Healthcare, defense-adjacent, and enterprise clients sign NDAs that prevent naming. Engagement scope, technology stack, and measured outcomes can be shared publicly. Client identity stays protected.
