Telco Customer Churn Analysis

Association:

N/A

Duration:

7 days

Data Analytics

SQL

Tableau

To practice SQL, I use IBM's sample data module tracking a fictional telco company's customer churn based on various factors and proceed through MySQL Workbench (local host).

I start off by creating a separate schema and six (6) separate tables to store the data and import the .csv files into MySQL.

-- TABLE FOR CUSTOMER INFORMATION
CREATE TABLE ci (
    customer_id VARCHAR(50) PRIMARY KEY,
    gender VARCHAR(10),
    age INTEGER,
    under_30 CHAR(3),
    senior_citizen CHAR(3),
    partner CHAR(3),
    dependents CHAR(3),
    number_of_dependents INTEGER,
    married CHAR(3)
);

-- TABLE FOR CUSTOMER LOCATION
CREATE TABLE l (
    customer_id VARCHAR(50) PRIMARY KEY,
    country VARCHAR(50),
    state VARCHAR(50),
    city VARCHAR(50),
    zip_code VARCHAR(5),
    total_population INTEGER,
    latitude REAL,
    longitude REAL
);

-- TABLE FOR ONLINE SERVICES
CREATE TABLE os (
    customer_id VARCHAR(50) PRIMARY KEY,
    phone_service CHAR(3),
    internet_service CHAR(3),
    online_security CHAR(3),
    online_backup CHAR(3),
    device_protection CHAR(3),
    premium_tech_support CHAR(3),
    streaming_tv CHAR(3),
    streaming_movies CHAR(3),
    streaming_music CHAR(3),
    internet_type VARCHAR(50)
);

-- TABLE FOR PAYMENT INFO
CREATE TABLE p (
    customer_id VARCHAR(50) PRIMARY KEY,
    contract VARCHAR(50),
    paperless_billing CHAR(3),
    payment_method VARCHAR(50),
    monthly_charges DOUBLE,
    avg_monthly_long_distance_charges DOUBLE,
    total_charges DOUBLE,
    total_refunds DOUBLE,
    total_extra_data_charges DOUBLE,
    total_long_distance_charges DOUBLE,
    total_revenue DOUBLE
);

-- TABLE FOR SERVICE OPTIONS
CREATE TABLE so (
    customer_id VARCHAR(50) PRIMARY KEY,
    tenure INTEGER,
    internet_service CHAR(3),
    phone_service CHAR(3),
    multiple_lines CHAR(3),
    avg_monthly_gb_download INTEGER,
    unlimited_data CHAR(3),
    offer VARCHAR(50),
    referred_a_friend CHAR(3),
    number_of_referrals INTEGER
);

-- TABLE FOR CHURN STATUS
CREATE TABLE sa (
    customer_id VARCHAR(50) PRIMARY KEY,
    satisfaction_score INTEGER,
    cltv INTEGER,
    customer_status VARCHAR(10),
    churn_score INTEGER,
    churn_label CHAR(3),
    churn_value TINYINT,
    churn_category VARCHAR(50),
    churn_reason TEXT
)

Exploratory Data Analysis (EDA)

The EDA includes geographic analysis, service analysis, retention segmentation, risk analysis, and in-depth churn analysis, with the end goal of finding patterns that lead to higher churn possibilities.

-- Top Localities with Highest Customer Churn

SELECT 
    l.city,
    COUNT(DISTINCT ci.customer_id) AS total_customers,
    ROUND(AVG(p.monthly_charges), 2) AS avg_monthly_charges,
    COUNT(DISTINCT CASE WHEN sa.customer_status = 'Churned' THEN ci.customer_id END) AS churned_customers,
    ROUND(COUNT(DISTINCT CASE WHEN sa.customer_status = 'Churned' THEN ci.customer_id END) * 100.0 / 
        COUNT(DISTINCT ci.customer_id), 2) AS churn_rate
FROM l
JOIN ci ON l.customer_id = ci.customer_id
JOIN p ON ci.customer_id = p.customer_id
JOIN sa ON ci.customer_id = sa.customer_id
GROUP BY l.city
HAVING COUNT(DISTINCT ci.customer_id) > 100

As this dataset includes only California residents, it is intuitive to see these cities as the top market. However, it's noteworthy that there is a significant difference between the churn rate of San Diego compared to other locations.

-- Service Package Comparison based on Popularity and Retention

SELECT 
    so.internet_service,
    so.phone_service,
    COUNT(*) as customer_count,
    ROUND(AVG(p.monthly_charges), 2) as avg_charges,
    ROUND(AVG(sa.satisfaction_score), 2) as avg_satisfaction,
    ROUND(COUNT(CASE WHEN sa.customer_status = 'Churned' THEN 1 END) * 100.0 / COUNT(*), 2) as churn_rate,
    ROUND(AVG(sa.churn_score), 2) as avg_churn_score,
    RANK() OVER (ORDER BY COUNT(*) DESC) as popularity_rank,
    RANK() OVER (ORDER BY ROUND(COUNT(CASE WHEN sa.customer_status = 'Churned' THEN 1 END) * 100.0 / COUNT(*), 2)) as retention_rank
FROM so
JOIN p ON so.customer_id = p.customer_id
JOIN sa ON so.customer_id = sa.customer_id
GROUP BY 
    so.internet_service, 
    so.phone_service
ORDER BY 
    customer_count DESC

Customers subscribing to both internet and phone services make up the biggest proportion of their customer base and possess the highest churn risk at the same time. On the contrary, phone service subscribers only have the lowest churn rate. It seems like there are more problems with their internet offerings than with the phone offerings.

-- Customer Segmentation based on Tenure and Gender

SELECT 
    tenure_group,
    ci.gender,
    COUNT(*) AS total_customers,
    COUNT(CASE WHEN sa.customer_status = 'Churned' THEN 1 END) AS churned_customers,
    ROUND(COUNT(CASE WHEN sa.customer_status = 'Churned' THEN 1 END) * 100.0 / COUNT(*), 2) AS churn_rate,
    ROUND(AVG(p.monthly_charges), 2) AS avg_monthly_charges
FROM (
    SELECT 
        CASE 
            WHEN tenure <= 12 THEN '0-12 months'
            WHEN tenure <= 24 THEN '13-24 months'
            ELSE 'Over 24 months'
        END AS tenure_group,
        customer_id
    FROM so
) tenure_data
JOIN ci ON tenure_data.customer_id = ci.customer_id
JOIN sa ON ci.customer_id = sa.customer_id
JOIN p ON ci.customer_id = p.customer_id
GROUP BY tenure_group, ci.gender
ORDER BY churn_rate DESC

It is not surprising to see new customers exiting the company a lot more than loyal customers. After using the company's services for more than two years, the possibility of service cancellation is significantly lower. There is not much difference between the two genders in this aspect.

-- Flaging High-risk Customers based on Retention and Bill patterns

SELECT 
    tenure_category.tenure_group,
    charge_category.charge_group,
    COUNT(*) as total_customers,
    COUNT(CASE WHEN sa.customer_status = 'Churned' THEN 1 END) as churned_customers,
    ROUND(COUNT(CASE WHEN sa.customer_status = 'Churned' THEN 1 END) * 100.0 / COUNT(*), 2) as churn_rate,
    CASE 
        WHEN ROUND(COUNT(CASE WHEN sa.customer_status = 'Churned' THEN 1 END) * 100.0 / COUNT(*), 2) > 
            (SELECT AVG(subq.churn_pct) + STDDEV(subq.churn_pct)
             FROM (
                SELECT 
                    ROUND(COUNT(CASE WHEN sa2.customer_status = 'Churned' THEN 1 END) * 100.0 / COUNT(*), 2) as churn_pct
                FROM so so2
                JOIN sa sa2 ON so2.customer_id = sa2.customer_id
                GROUP BY 
                    CASE 
                        WHEN so2.tenure <= 6 THEN '0-6'
                        WHEN so2.tenure <= 12 THEN '7-12'
                        WHEN so2.tenure <= 24 THEN '13-24'
                        ELSE 'Over 24'
                    END
             ) subq
            ) THEN 'High Risk'
        WHEN ROUND(COUNT(CASE WHEN sa.customer_status = 'Churned' THEN 1 END) * 100.0 / COUNT(*), 2) > 
            (SELECT AVG(subq.churn_pct)
             FROM (
                SELECT 
                    ROUND(COUNT(CASE WHEN sa2.customer_status = 'Churned' THEN 1 END) * 100.0 / COUNT(*), 2) as churn_pct
                FROM so so2
                JOIN sa sa2 ON so2.customer_id = sa2.customer_id
                GROUP BY 
                    CASE 
                        WHEN so2.tenure <= 6 THEN '0-6'
                        WHEN so2.tenure <= 12 THEN '7-12'
                        WHEN so2.tenure <= 24 THEN '13-24'
                        ELSE 'Over 24'
                    END
             ) subq
            ) THEN 'Medium Risk'
        ELSE 'Low Risk'
    END as risk_level
FROM so
JOIN sa ON so.customer_id = sa.customer_id
JOIN p ON so.customer_id = p.customer_id,
LATERAL (
    SELECT 
    CASE 
        WHEN so.tenure <= 6 THEN '0-6'
        WHEN so.tenure <= 12 THEN '7-12'
        WHEN so.tenure <= 24 THEN '13-24'
        ELSE 'Over 24'
    END as tenure_group
) tenure_category,
LATERAL (
    SELECT
    CASE 
        WHEN p.monthly_charges <= 50 THEN '0-50'
        WHEN p.monthly_charges <= 70 THEN '51-70'
        WHEN p.monthly_charges <= 90 THEN '71-90'
        ELSE 'Over 90'
    END as charge_group
) charge_category
GROUP BY 
    tenure_category.tenure_group,
    charge_category.charge_group
ORDER BY churn_rate DESC

The risk analysis combining both tenure and bill patterns reveals that the highest risk groups include newly joined customers and customers with higher-than-normal service usage (accordingly, their monthly charges are higher). Once more, tenure is confirmed to play an important role in predicting the churn rate; meanwhile, people with intensive usage might look for different options to lower their monthly bills.

-- Customer Demographics and Churn

SELECT 
    ci.gender,
    ci.under_30,
    ci.senior_citizen,
    ci.dependents,
    ci.married,
    ROUND(AVG(sa.churn_score), 2) AS avg_churn_score
FROM ci
JOIN sa ON ci.customer_id = sa.customer_id
GROUP BY ci.gender, ci.under_30, ci.senior_citizen, ci.dependents, ci.married
ORDER BY avg_churn_score DESC

Looking deeper at the demographic factors suggests that older people (30 years old and above) might be harder to retain as they seem to have higher average churn scores.

-- Satisfaction Score Analysis

SELECT 
    sa.satisfaction_score,
    COUNT(*) as customer_count,
    ROUND(AVG(p.monthly_charges), 2) as avg_charges,
    ROUND(COUNT(CASE WHEN sa.customer_status = 'Churned' THEN 1 END) * 100.0 / COUNT(*), 2) as churn_rate,
    ROUND(AVG(cltv), 2) as avg_lifetime_value
FROM sa
JOIN p ON sa.customer_id = p.customer_id
GROUP BY sa.satisfaction_score
ORDER BY sa.satisfaction_score DESC

-- CLTV Segments Analysis

SELECT MIN(cltv), ROUND(AVG(cltv)), MAX(cltv)
FROM sa; 

/* Min: 2003, Mean: 4400, Max: 6500 */

SELECT 
    CASE 
        WHEN cltv BETWEEN 2000 AND 3000 THEN 'Low Value'
        WHEN cltv BETWEEN 3001 AND 5500 THEN 'Medium Value'
        ELSE 'High Value'
    END as customer_segment,
    COUNT(*) as customer_count,
    ROUND(AVG(satisfaction_score), 2) as avg_satisfaction,
    ROUND(AVG(churn_score), 2) as avg_churn_score,
    ROUND(COUNT(CASE WHEN customer_status = 'Churned' THEN 1 END) * 100.0 / COUNT(*), 2) as churn_rate
FROM sa
GROUP BY 
    CASE 
        WHEN cltv BETWEEN 2000 AND 3000 THEN 'Low Value'
        WHEN cltv BETWEEN 3001 AND 5500 THEN 'Medium Value'
        ELSE 'High Value'
    END
ORDER BY avg_churn_score DESC

-- Customer Status and Satisfaction Correlation

SELECT 
    customer_status,
    ROUND(AVG(satisfaction_score), 2) as avg_satisfaction,
    ROUND(AVG(cltv), 2) as avg_lifetime_value,
    ROUND(AVG(churn_score), 2) as avg_churn_score,
    COUNT(*) as customer_count
FROM sa
GROUP BY customer_status
ORDER BY customer_status, customer_count DESC

These analyses confirm the fact that unsatisfied customers will look elsewhere. Thus, customer satisfaction is critical to business success in general and customer retention in particular as happy customers are willing to pay more and stay longer.

-- Leading Churn Category and Reason

SELECT 
    churn_category,
    churn_reason,
    COUNT(*) as customer_count,
    ROUND(AVG(satisfaction_score), 2) as avg_satisfaction,
    ROUND(AVG(cltv), 2) as avg_lifetime_value,
    ROUND(AVG(churn_score), 2) as avg_churn_score
FROM sa
WHERE customer_status = 'Churned'
GROUP BY churn_category, churn_reason
ORDER BY customer_count DESC

This highlights the biggest areas for improvement within their operation. Telecom is a competitive industry as the switching cost tends to be low while the entry barrier is medium. Thus, customers can easily shop around and compare alternatives in real-time and make decisions within days or weeks. In this scenario, customers complain about the price, service quality and coverage, and customer support; as a result, they decide to switch carriers and providers offering better pricing, promotion, and quality.

Suggestions

In order for the company to enhance their customer churn, they should focus on improving customer satisfaction through:

  • Improving their technical capability to provide better speed and reliability

  • Provide training for customer service and sales staff for better interactions

  • Upgrading their customer-facing interface with more intuitive UI and smoother experience

  • Reducing their cost of operation to lower the price charged to customers without hurting the company's profit

It would be practical for the company to target customers in the mid-range of tenure (~ one year) or who are moderately satisfied with its offering (having a satisfaction score of 3) and try to solve the problems directly to convert them into happy and loyal customers, rather than acquiring new customers or retaining extremely dissatisfied customers.

Let's Talk

Let's Talk

Let's Talk

© 2025. All rights Reserved.

© 2025. All rights Reserved.

© 2025. All rights Reserved.