From Two Networks to One: What Large-Scale RAN Integration Really Breaks First

LTE · 5G NSA · RAN Integration · 8 min read

Large network integrations don't fail where people expect them to. Capacity is rarely the first problem. Coverage isn't either.

What breaks first is assumption alignment.

The biggest technical challenge was not spectrum reuse or site consolidation. It was reconciling how two nationwide RANs interpreted the same user behavior differently. On paper, both networks were healthy. KPIs looked reasonable in isolation. Once traffic began shifting at scale, the mismatches surfaced quickly.

Where the mismatches appeared

None of the failure classes below tripped red alarms. They showed up as soft degradation: retries, increased setup times, edge failures. The kind of problems customers feel before dashboards turn red.
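One way to surface that kind of soft degradation is to watch tail-latency drift rather than fixed alarm thresholds. Below is a minimal sketch, assuming per-interval setup-time samples are collected per cell; every name and the 15% tolerance are hypothetical choices, not a production detector.

```python
# Sketch: flag soft degradation by tail-latency drift, not thresholds.
# Assumes `intervals` holds per-interval setup-time samples (ms) for
# one cell, oldest first. All names and constants are hypothetical.
from statistics import quantiles

def p95(samples: list[float]) -> float:
    # 95th percentile of one measurement interval (needs >= 2 samples)
    return quantiles(samples, n=100)[94]

def drifting(intervals: list[list[float]],
             baseline_n: int = 24,
             tolerance: float = 1.15) -> bool:
    """True when the latest interval's p95 setup time exceeds the
    median p95 of the preceding `baseline_n` intervals by more than
    `tolerance` -- drift a red/green threshold would not catch."""
    if len(intervals) <= baseline_n:
        return False  # not enough history to form a baseline
    history = sorted(p95(i) for i in intervals[-baseline_n - 1:-1])
    median_p95 = history[len(history) // 2]
    return p95(intervals[-1]) > tolerance * median_p95
```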

First failure classes to surface at scale
Mobility at inter-network boundaries
- Handover decisions had been tuned within each RAN independently; at the inter-network boundary, neither RAN's assumptions held.
- Measurement reporting thresholds were misaligned across the boundary.
- Result: late or failed handovers in transition zones, not visible in either network's standalone KPIs.

UE behavior under mixed timer sets
- T3412 / T3324 timer values differed between networks.
- Devices switching between networks encountered inconsistent idle-mode behavior: paging gaps, registration delays.
- Aggregate impact: elevated setup times, not attributed to integration. (A parameter-diff sketch covering both this and the threshold mismatch follows the list.)

NSA anchoring under load
- LTE anchor selection logic differed between vendor implementations.
- Under load, anchor cell selection produced inconsistent NR availability.
- Devices on "good coverage" cells still failed NR establishment because scheduler assumptions for the NSA split bearer differed.
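To make the first two failure classes concrete: a cross-network parameter diff over boundary neighbor pairs would have surfaced both the timer and threshold mismatches before traffic did. A minimal sketch, assuming each network's configuration can be exported as per-cell dictionaries; the field names (t3412_min, t3324_sec, a3_offset_db, hysteresis_db) are hypothetical placeholders, not any vendor's schema.

```python
# Sketch: diff idle-mode timers and measurement-report settings across
# inter-network neighbor pairs. Field names are hypothetical.
BOUNDARY_PARAMS = ("t3412_min", "t3324_sec", "a3_offset_db", "hysteresis_db")

def boundary_mismatches(params_a: dict[str, dict],
                        params_b: dict[str, dict],
                        boundary_pairs: list[tuple[str, str]]):
    """Yield (cell_a, cell_b, param, value_a, value_b) for every
    parameter that differs across an inter-network neighbor pair."""
    for cell_a, cell_b in boundary_pairs:
        cfg_a, cfg_b = params_a[cell_a], params_b[cell_b]
        for p in BOUNDARY_PARAMS:
            if cfg_a.get(p) != cfg_b.get(p):
                yield cell_a, cell_b, p, cfg_a.get(p), cfg_b.get(p)

# Usage (inputs are the two networks' parameter exports):
# for hit in boundary_mismatches(export_a, export_b, neighbor_pairs):
#     print("boundary mismatch:", hit)
```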
Why site-level readiness checks missed this

Integration validation at this point was largely site-centric: is the site commissioned, are the parameters set, does the cell pass acceptance checks. Each site passed. The interactions between sites at the network boundary were never in scope.

| Check performed | What it confirmed | What it missed |
| --- | --- | --- |
| Site acceptance | Cell reachable, KPIs within target at low load | Behavior at the boundary under sustained traffic from migrating devices |
| Parameter audit | Parameters match template per network | Interaction between adjacent cells from different networks with different baselines |
| Standalone KPI review | Each network individually within target | Cross-network mobility path failure rate, not visible in either KPI set alone |
| Lab / controlled test | Feature functions under modeled device behavior | Real device population behavior differing from the lab model at scale |
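The third row deserves emphasis: a cross-network mobility path failure rate has no natural owner when each network computes KPIs only over its own cells. A minimal sketch of that metric, assuming handover events have already been correlated into records with hypothetical fields:

```python
# Sketch: the KPI neither network computes alone -- failure rate of
# handovers whose source and target cells sit in different networks.
# Event fields ("src", "dst", "ok") are hypothetical.
from collections import Counter

def cross_network_ho_failure_rate(events, network_of: dict[str, str]) -> float:
    """`events`: dicts like {"src": cell, "dst": cell, "ok": bool};
    `network_of`: maps a cell id to its network, "A" or "B"."""
    tally = Counter()
    for e in events:
        if network_of[e["src"]] != network_of[e["dst"]]:
            tally["total"] += 1
            tally["failed"] += 0 if e["ok"] else 1
    return tally["failed"] / tally["total"] if tally["total"] else 0.0
```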
What data sources exposed the patterns

The shift was from site-level checks to flow-level analysis. Correlating engineering data records, call traces, and user-plane telemetry across both networks was the only way to see what was actually happening to devices as they moved between the two RANs.

Data sources and what each contributed:

Engineering Data Records (EDRs)
- Per-device session history across both networks
- Anchor transitions, bearer events, mobility sequences
- Identified which device types and mobility patterns produced failures at inter-network boundaries

Call traces (Uu / X2 / S1)
- Timer interactions at boundary handovers
- RAN-core signaling timing mismatches
- VoLTE bearer re-establishment sequences post-handover
- Confirmed RAN-core interaction failures not visible in radio KPIs alone

User-plane telemetry
- Throughput and latency per session across network segments
- Identified "good coverage" cells where scheduler assumption mismatches degraded data sessions despite strong RF

Combined view
- Certain mobility failures occurred only at specific anchor/secondary transitions
- VoLTE issues traced to RAN-core timing, not RF conditions
- Scheduler divergence quantified per vendor pair
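Mechanically, the combined view is a join: anchor on each EDR session, then attach the call-trace events and user-plane samples that fall inside its time window for the same device. A minimal sketch with hypothetical record layouts; real EDR and trace schemas vary by vendor.

```python
# Sketch: flow-level correlation across the three sources. Each record
# is a dict keyed by device id with epoch-second timestamps; layouts
# here are hypothetical.
def correlate(edr_sessions, trace_events, up_samples):
    """Enrich each EDR session ({"dev", "start", "end", ...}) with the
    trace events and user-plane samples ({"dev", "ts", ...}) that fall
    inside its window for the same device."""
    for s in edr_sessions:
        def in_window(r):
            return r["dev"] == s["dev"] and s["start"] <= r["ts"] <= s["end"]
        yield {
            **s,
            "trace": [t for t in trace_events if in_window(t)],
            "user_plane": [u for u in up_samples if in_window(u)],
        }
```

At real scale this join runs in a distributed pipeline rather than a Python loop; the structure of the correlation, not the implementation, is the point.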
Fig 1 — Integration validation: site view vs flow view
[Diagram: the site view shows each network's sites passing acceptance independently, with the boundary out of scope and integration issues not visible; the flow view follows a device path across both networks, correlating EDR, call-trace, and user-plane data, making boundary behavior visible and failure patterns identifiable.]

Integration is not a radio problem. It is a system-behavior problem. The only way to see it is through data that follows the user end-to-end — not data that describes the network site by site.

The question that reframed the validation approach: not "is the site ready" but "does this configuration behave predictably when millions of devices behave differently than the lab models assumed." That is a harder question. It requires data at a different layer. It also prevents the class of post-launch problems that site-level checks structurally cannot see.

That mindset carried forward into 5G Standalone readiness, nationwide KPI normalization, and analytics platforms built to track behavior rather than counters. Whenever the scope of a network change was large enough that no individual team could hold the full system context simultaneously, the answer was always the same: follow the device, correlate the layers, and trust the data over the model.

LTE  ·  5G NSA  ·  RAN Integration  ·  Mobility  ·  OSS Analytics  ·  Performance Engineering  ·  Network Scale  ·  Telecommunications
