Network Availability and Performance

A large cellular carrier wanted its network to deliver better coverage and performance. Cell site configuration was performed manually by RF engineers, who prioritized network failures and the deployment of new cell sites. There were not enough resources to continually optimize the carrier's entire network as conditions changed. Changes in subscriber patterns (new offices, festivals, sporting events, holidays, etc.), construction creating new obstructions, and the time of year (changing foliage) regularly impacted the network. Some areas with strong signals performed poorly due to issues with antenna tilt or interference from adjacent cell sites (co-channel interference, ping-pong handoff, etc.). With multiple technologies (2G, 3G, LTE) deployed across 3 frequency bands, automated configuration was needed.

Self Organizing Networks (SON)

3GPP, the wireless standards organization, has created 2 sets of SON standards:
  • D-SON (Distributed Self Organizing Networks): D-SON is implemented at the edge of the network. Running on the Element Management System (EMS, a network element of the evolved packet core), it manages, configures and optimizes the network elements in a cluster. All the network elements in a cluster managed by D-SON are from the same vendor and typically support a single technology (for example LTE).
  • C-SON (Centralized Self Organizing Networks): C-SON is a higher level solution that manages all the networks across a large geographic region. For example, a C-SON deployment might interface with a Samsung LTE network, an Ericsson GSM network and Alcatel small cells. This allows C-SON to optimize across all the networks in a region, for example adjusting hand-off parameters to shift some traffic off a congested LTE cell site to a collocated HSPA (3G) cell site. Since C-SON manages networks from multiple vendors, many implementations are from smaller startups rather than the established network equipment providers.
C-SON operates by periodically reading the performance and configuration data from the network elements within the region being optimized and using this data to identify configuration changes that would improve network coverage and performance. These configuration changes are then pushed out to the Element Management Systems, which control the cell sites. This process is repeated as often as every 15 minutes. To reduce the risk of degrading the network, the amount of change during a single cycle is limited, so large changes occur incrementally over several cycles. At the start of each cycle, performance is compared to the previous cycle; if performance degraded rather than improved, the change can be backed out. Repeated degradation raises an alert so that the C-SON analytic model for that parameter can be reviewed.
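
The cycle described above can be sketched in a few lines of Python. This is only an illustration of the idea (limited change per cycle, back-out on degradation, alert on repeated degradation), not the vendor's implementation. The names ParameterState, clamp_step and run_cycle, the single "higher is better" KPI, and the values of MAX_STEP and ALERT_THRESHOLD are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

MAX_STEP = 1.0        # assumed limit on change per cycle (e.g. degrees of antenna tilt)
ALERT_THRESHOLD = 3   # assumed number of back-outs before the model is flagged for review


@dataclass
class ParameterState:
    value: float             # currently configured value
    previous_value: float    # value before the last change, kept so it can be backed out
    last_kpi: float          # KPI seen at the start of the previous cycle (higher is better)
    degradation_count: int = 0


def clamp_step(current: float, target: float, max_step: float = MAX_STEP) -> float:
    """Move from current toward target, but by no more than max_step."""
    delta = max(-max_step, min(max_step, target - current))
    return current + delta


def run_cycle(state: ParameterState, kpi_now: float, target: float) -> List[str]:
    """Run one optimization cycle and return any alerts raised."""
    alerts: List[str] = []
    if kpi_now < state.last_kpi:
        # Performance degraded since the last change: back the change out.
        state.value = state.previous_value
        state.degradation_count += 1
        if state.degradation_count >= ALERT_THRESHOLD:
            alerts.append("repeated degradation: review analytic model")
    else:
        # Performance held or improved: take one more limited step toward the target.
        state.previous_value = state.value
        state.value = clamp_step(state.value, target)
    state.last_kpi = kpi_now
    return alerts
```

With these assumed limits, a tilt change from 4 to 7 degrees would be applied over three cycles rather than in one push, and any cycle that made the KPI worse would be reverted before the next step is attempted.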

How I got involved

The carrier reviewed several proposals and selected two for consideration. Because the size and resources of one of the down-selected companies were a concern, my company was asked to partner with it. We were asked to provide program management and staff augmentation for deployment and ongoing operations. In reviewing the C-SON vendor's proposal and architecture, I identified several opportunities to customize the C-SON solution to the customer's network, substantially improving the proposal.

Integration with existing network analytics

C-SON requires collecting a very large volume of data from each of the Element Management Systems (EMS) as often as every 15 minutes. The data consists of configuration data (small), performance data (small) and Per Call Measurement Data (PCMD) (large). PCMD data consists of records created at every step of a voice or data transaction. These records document every aspect of each transaction with a handset, including the cell sector being used, signal quality, hand-offs, data headers, geolocation, etc. There can be as much as 500GB of PCMD data for each 15 minute period. Because of the data volume, C-SON servers are paired with each EMS in each Regional Data Center (RDC). The carrier we were working with had 20 regional data centers, each with at least one EMS for the 2G/3G network and one for the LTE network. The C-SON solution actually has 3 servers (application, database, console) per EMS, each sized according to the number of cell sites managed by the EMS.
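
To put these figures in perspective, here is some rough arithmetic in Python. It rests on assumptions the text does not settle: that the 500GB peak applies to a single collection point, that GB means decimal gigabytes, and that every center has exactly the minimum of two EMS. The function names are made up for the example.

```python
PCMD_GB_PER_WINDOW = 500       # peak PCMD volume per window, from the text
WINDOW_SECONDS = 15 * 60       # one 15 minute collection period
REGIONAL_DATA_CENTERS = 20     # from the text
EMS_PER_CENTER = 2             # assumption: the stated minimum (one 2G/3G, one LTE)
SERVERS_PER_EMS = 3            # application, database, console


def sustained_rate_gbit_per_s(gigabytes: float, seconds: float) -> float:
    """Average line rate needed to move the data within one window."""
    return gigabytes * 8 / seconds


def minimum_server_count(centers: int, ems_per_center: int, servers_per_ems: int) -> int:
    """Lower bound on C-SON servers when each EMS gets its own set."""
    return centers * ems_per_center * servers_per_ems


rate = sustained_rate_gbit_per_s(PCMD_GB_PER_WINDOW, WINDOW_SECONDS)
servers = minimum_server_count(REGIONAL_DATA_CENTERS, EMS_PER_CENTER, SERVERS_PER_EMS)
print(f"{rate:.2f} Gbit/s sustained, at least {servers} C-SON servers")
# prints: 4.44 Gbit/s sustained, at least 120 C-SON servers
```

Under these assumptions, moving the data elsewhere within each window means sustaining several gigabits per second, which is why the servers sit next to each EMS, and why that design leads to well over a hundred machines spread across the regional data centers.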

Customizing C-SON to the Customer's Network

My company had recently installed an Analytics Platform (AP) at this cellular provider. The AP replaced several large point solutions supporting network operations, billing and customer care. It was designed to ingest data from a wide variety of sources using a stream process capable of filtering, reducing, aggregating and transforming data on the fly. Data from the stream process can be stored in a Data Warehouse supporting any number of business processes or forwarded to other applications. The AP is designed to be easily extended, with new applications only requiring new data models and perhaps additional data sources. The AP reduced the time to value* for many applications from days to minutes.

Knowing that PCMD data was already being forwarded from every EMS throughout the carrier network to the AP, I proposed an alternate architecture leveraging the AP. Rather than pulling data directly from each EMS, a data model on the AP would access the PCMD data already available. Instances of the C-SON servers would run on a Private Cloud located in one of the Network Operations Centers (NOC). Moving to a private cloud eliminated the individual C-SON servers paired with each EMS.

When possible, providers deploy applications and the hardware they run on upstream at the NOCs, where staff and full service data centers are located. They avoid deploying to Regional Data Centers, which are lights-out operations with minimal support. Applications in Regional Data Centers are harder and more expensive to support, especially when each server is uniquely configured. C-SON servers with large databases (and thus many hard drives) are problematic.

Having instances running on a Private Cloud has many advantages and provides opportunities to enhance the C-SON solution. These include:
  • Dozens of uniquely configured servers dispersed among 20 RDCs are eliminated
  • They are replaced by instances running on a single Private Cloud infrastructure in the NOC
  • The Private Cloud is easier to manage and more economical
  • New C-SON instances are easily added and existing instances can be dynamically scaled to meet growing needs

Enhancing the C-SON solution

Leveraging the Analytics Platform provides opportunities beyond improved infrastructure. One function of the AP is to normalize the format of PCMD data, which varies from vendor to vendor. Using normalized data from the AP makes it much easier for applications (including C-SON) to support new network hardware (for example a new small cell vendor). In addition, PCMD data from the AP is enriched with additional data, for example the data plan of the customer. Having this data would allow C-SON to tailor customer experience by assigning more network resources to locations with many high value customers, for example adjusting antennas and Mobile Load Balancing parameters to improve coverage at a sporting event attended by many customers with high value plans.
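
A minimal Python sketch of normalization and enrichment follows. It is not the AP's actual data model: the vendor names, the raw field names ("cell", "rsrp_dbm", "sub", "sectorId", "signal", "imsi"), the NormalizedPcmd record and the plan lookup table are all hypothetical, chosen only to show why a common format makes adding a vendor cheap.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class NormalizedPcmd:
    cell_id: str
    signal_dbm: float
    subscriber_id: str
    data_plan: Optional[str] = None   # filled in by enrichment


def _from_vendor_a(raw: dict) -> NormalizedPcmd:
    # Hypothetical vendor A layout.
    return NormalizedPcmd(raw["cell"], float(raw["rsrp_dbm"]), raw["sub"])


def _from_vendor_b(raw: dict) -> NormalizedPcmd:
    # Hypothetical vendor B layout.
    return NormalizedPcmd(raw["sectorId"], float(raw["signal"]), raw["imsi"])


# Supporting a new vendor means adding one parser here; consumers are unchanged.
PARSERS: Dict[str, Callable[[dict], NormalizedPcmd]] = {
    "vendor_a": _from_vendor_a,
    "vendor_b": _from_vendor_b,
}


def normalize(vendor: str, raw: dict, plans: Dict[str, str]) -> NormalizedPcmd:
    """Convert a vendor record to the common format and attach the data plan."""
    record = PARSERS[vendor](raw)
    return replace(record, data_plan=plans.get(record.subscriber_id))


plans = {"s1": "premium"}
a = normalize("vendor_a", {"cell": "A-101", "rsrp_dbm": "-95.5", "sub": "s1"}, plans)
b = normalize("vendor_b", {"sectorId": "B-7", "signal": -101, "imsi": "s2"}, plans)
```

An application such as C-SON would only ever see NormalizedPcmd records, so it could weight locations by data_plan without knowing which vendor's equipment produced the underlying record.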

C-SON servers only have data for cell sites in their geographic region. Ideally, C-SON would also factor in data for cell sites that lie along its border but belong to an adjacent region. The AP collects data from all regions, so adjusting the data model can add data from cell sites in adjoining regions, improving optimization along the region borders. C-SON can also be integrated with the network assurance system. For example, a cell site failure alarm generates an outage event that is sent to the appropriate C-SON instance. C-SON takes this into account, attempting to heal the outage by shifting the coverage of adjacent cells. C-SON can also generate alerts that are sent to the network assurance application when it detects a network problem. Playbooks on the network assurance system can in turn generate problem tickets. This level of automation reduces the length and impact of network outages.
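
The border-site idea can be illustrated with a small Python sketch. It assumes a deliberately simple selection rule that is not taken from the C-SON product: sites are placed on a flat grid in kilometres, and a site from another region is included when it lies within a chosen distance of any site in the region being optimized. CellSite, sites_for_region and the 5 km default are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CellSite:
    site_id: str
    region: str
    x_km: float   # assumed planar coordinates, for simplicity
    y_km: float


def sites_for_region(sites: List[CellSite], region: str,
                     border_km: float = 5.0) -> List[CellSite]:
    """Return the region's own sites plus nearby sites from adjoining regions."""
    own = [s for s in sites if s.region == region]
    neighbors = [
        s for s in sites
        if s.region != region
        and any(math.hypot(s.x_km - o.x_km, s.y_km - o.y_km) <= border_km
                for o in own)
    ]
    return own + neighbors


sites = [
    CellSite("n1", "north", 0.0, 0.0),
    CellSite("n2", "north", 0.0, 10.0),
    CellSite("s1", "south", 0.0, -3.0),    # 3 km from n1: included for "north"
    CellSite("s2", "south", 0.0, -40.0),   # far from the border: excluded
]
selected = [s.site_id for s in sites_for_region(sites, "north")]
```

Because the AP already holds data for every region, a filter of this kind could live in the data model feeding each C-SON instance, with no change to how the data is collected.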

Without integration with the AP, the C-SON servers for each region are an island. Network engineers remotely connect to each region to access reports and manage the C-SON solution. Integration with the AP allows performance data, configuration data and reports to be added to the data warehouse. Using this data and the AP functionality, a national dashboard can be created. The national dashboard can provide detailed reports and interactive monitoring for the provider's entire network, covering network status, performance, congestion and network planning.

*  Time to Value - The interval between the occurrence of an event that generates data and the time the data is available in a usable form. For example, the cause of a dropped call can be determined by applications that examine PCMD. The AP gathers PCMD data every 5 minutes and stores it in the data warehouse, where it is immediately accessible to customer care applications. Before the AP, PCMD data was processed overnight. In this case Time to Value was reduced from 24 hours to 5 minutes.
