All works till now.

This commit is contained in:
2026-03-09 13:51:03 +09:30
parent 315bafd6dc
commit d4e336d385
16 changed files with 594 additions and 38 deletions

360
README.md
View File

@@ -1,2 +1,362 @@
# birthdaymessaging
Overview
The Birthday Messaging Intelligence Platform is a data-driven engagement system designed to transform static birthday campaigns into an intelligent engagement engine.
The system integrates CRM data, behavioral interaction logs, and predictive analytics to automatically identify high-value engagement opportunities, detect churn risk, and orchestrate personalized birthday messaging campaigns.
The platform evolves through three stages:
Stage Description
Week 2 Data pipeline and engagement preparation
Week 3 Predictive intelligence and behavioral segmentation
Future Multi-channel automation and campaign optimization
The end result is a system that predicts disengagement before it happens and activates personalized engagement campaigns.
System Architecture
The project is organized into modular components following a data pipeline → intelligence → automation architecture.
CRM Data + Clickstream Logs
Unified Data Pipeline
Data Cleaning & Compliance
Engagement Intelligence
Churn Prediction
Predictive Segmentation
Automation Triggers
Project Structure
birthdaymessaging/
├── config.yaml
├── main.py
├── README.md
├── data/
│ ├── raw/
│ └── processed/
├── docs/
│ ├── ethics.md
│ └── model-audits/
├── src/
│ ├── pipeline/ # Week 2 Data Engineering
│ │ ├── unified_stream.py
│ │ ├── relative_date_trigger.py
│ │ ├── rf_scoring.py
│ │ ├── list_hygiene.py
│ │ ├── ethics_flags.py
│ │ ├── deliverability_check.py
│ │ └── config_loader.py
│ │
│ ├── models/ # Week 3 Intelligence Layer
│ │ ├── engagement_scoring.py
│ │ ├── churn_classifier.py
│ │ └── send_time_optimizer.py
│ │
│ ├── segmentation/
│ │ └── predictive_segments.py
│ │
│ └── automation/
│ ├── trigger_engine.py
│ └── reactivation_triggers.yaml
Week 2: Data Engineering Pipeline
Week 2 focuses on building a clean, compliant data foundation.
Unified Data Stream
File:
src/pipeline/unified_stream.py
Functions:
Load CRM dataset
Load clickstream activity logs
Pseudonymize user identifiers
Merge datasets
Normalize timestamps (UTC)
Output:
unified_dataset.csv
Birthday Recursion Trigger
File:
relative_date_trigger.py
Implements recurring campaign triggers such as:
Birthday approaching
Pre-birthday reminder
Post-birthday follow-up
Ensures contacts re-enter the workflow annually.
Recency-Frequency Scoring
File:
rf_scoring.py
Calculates engagement indicators:
Recency index
Frequency of sessions
Session change delta
Used as an early engagement signal.
List Hygiene Automation
File:
list_hygiene.py
Improves deliverability by removing:
Invalid email domains
Disposable email services
Role-based addresses
Ethics and Compliance Layer
File:
ethics_flags.py
Detects sensitive data usage:
medical_condition
political_opinion
dietary_restrictions
Ensures compliance with GDPR principles.
Deliverability Simulation
File:
deliverability_check.py
Simulates email infrastructure readiness:
SPF validation
DKIM compatibility
DMARC enforcement
inactive contact suppression
Week 3: Engagement Intelligence Engine
Week 3 converts the pipeline into a predictive analytics platform.
Engagement Scoring Engine
File:
src/models/engagement_scoring.py
Calculates a normalized engagement score (0100) based on:
Recency of interaction
Session frequency
Session duration
Click engagement depth
Formula:
Engagement Score =
(Recency Weight × Normalized Recency)
+ (Frequency Weight × Session Count)
+ (Engagement Depth Weight × Interaction Index)
Output classification:
Score Tier
75100 Highly Engaged
5074 Stable
2549 Declining
024 At Risk
Churn Risk Classification
File:
src/models/churn_classifier.py
Defines churn risk using behavioral signals.
Conditions:
No login ≥ X days
AND
Engagement score drop ≥ Y%
Risk categories:
Risk Level Description
Low Healthy engagement
Medium Engagement declining
High High probability of churn
Output fields:
churn_risk_flag
churn_probability_score
risk_class
Predictive Segmentation
File:
src/segmentation/predictive_segments.py
Creates behavioral marketing segments using:
Lifecycle Stage
Engagement Tier
Birthday Proximity
Churn Risk Overlay
Example segment:
At-Risk + Birthday Approaching
Send-Time Optimization
File:
src/models/send_time_optimizer.py
Analyzes historical open patterns and determines optimal send windows.
Outputs:
optimal_send_hour
optimal_send_day
Automation Triggers
File:
src/automation/reactivation_triggers.yaml
Defines campaign orchestration rules.
Example:
IF churn_risk_flag = True
AND birthday_within_30_days = True
THEN send reminder email
WAIT 24h
IF unopened → send SMS
Model Governance & Ethics
Documentation:
docs/model-audits/week3-bias-review.md
docs/ethics.md
Audits include:
Geographic bias
Age-based performance variance
False-positive churn classification
Transparency measures include clear documentation of automated decision logic.
Running the Pipeline
Run the full pipeline:
python main.py
Pipeline order:
Week 2:
Data Merge
Birthday Triggers
RF Scoring
List Hygiene
Ethics Checks
Deliverability Simulation
Week 3:
Engagement Scoring
Churn Classification
Predictive Segmentation
Send-Time Optimization
Final dataset:
data/processed/final_intelligence_output.csv
Success Criteria
The system is considered operational when:
Engagement scores generated for all active users
Churn risk classification validated
Predictive segments documented
Reactivation triggers tested
Send-time optimization operational
Ethics documentation updated
Bias audit completed
Outcome
At completion, the platform functions as an Engagement Intelligence Engine.
Capabilities include:
Predicting disengagement
Prioritizing high-value users
Automating reactivation campaigns
Personalizing birthday marketing
Ensuring ethical data usage
The platform transforms traditional birthday messaging into a behaviorally intelligent engagement strategy.

View File

@@ -2,3 +2,12 @@ crm_file: "sample_data/crm.csv"
clickstream_file: "sample_data/clickstream.csv"
output_dir: "output"
log_dir: "logs"
reactivation_trigger:
conditions:
churn_risk_flag: true
birthday_within_30_days: true
actions:
- send_email: emotional_reminder
- wait: 24h
- if_unopened:
send_sms: follow_up

View File

@@ -62,3 +62,59 @@ INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv
INFO:root:Loading CRM and clickstream data...
INFO:root:Hashing user IDs for pseudonymization...
INFO:root:Merging datasets...
INFO:root:Unified dataset saved to output\unified_dataset.csv

17
main.py
View File

@@ -6,9 +6,13 @@ from src.pipeline.list_hygiene import apply_list_hygiene
from src.pipeline.ethics_flags import apply_ethics_checks
from src.pipeline.deliverability_check import run_deliverability_check
print("=== WEEK 2 PIPELINE START ===")
from src.models.engagement_scoring import calculate_engagement_scores
from src.models.churn_classifier import classify_churn
from src.models.send_time_optimizer import optimize_send_time
from src.models.predictive_segments import build_predictive_segments
load_and_merge()
print("=== WEEK 2 PIPELINE START ===")
df= load_and_merge()
apply_birthday_recursion()
calculate_rf_score()
apply_list_hygiene()
@@ -17,3 +21,12 @@ apply_ethics_checks()
run_deliverability_check()
print("=== WEEK 2 PIPELINE COMPLETE ===")
print("=== WEEK 3 START ===")
df = calculate_engagement_scores(df)
df = classify_churn(df)
df = optimize_send_time(df)
df=build_predictive_segments(df)
print("=== WEEK 3 COMPLETE ===")

View File

@@ -1,10 +1,10 @@
user_id_x,email,birth_date,gender,consent_flag,user_id_hashed,user_id_y,timestamp,page_depth,dwell_time,session_id,birthday_trigger,recency_index,rf_score
4,dana@outlook.com,1992-05-30,F,True,4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a,4,2026-01-25 16:45:00+00:00,3,60,401,1992-04-30,28,0.034482758620689655
3,charlie@tempmail.com,2000-11-21,M,False,4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce,3,2026-01-05 12:30:00+00:00,2,30,301,2000-10-21,48,0.02040816326530612
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-01-15 09:15:00+00:00,5,120,101,1990-01-17,38,0.02564102564102564
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-02-01 10:20:00+00:00,3,90,102,1990-01-17,21,0.045454545454545456
7,grace@no-reply.com,1993-03-15,F,True,7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451,7,2025-09-15 17:10:00+00:00,2,45,701,1993-02-15,160,0.006211180124223602
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2025-12-20 14:10:00+00:00,4,60,201,1985-06-05,64,0.015384615384615385
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2026-02-10 08:50:00+00:00,6,180,202,1985-06-05,13,0.07142857142857142
6,frank@gmail.com,1995-12-01,M,True,e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683,6,2026-01-30 09:00:00+00:00,4,120,601,1995-11-01,24,0.04
5,eric@mailinator.com,1988-09-10,M,True,ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d,5,2025-08-01 11:00:00+00:00,5,150,501,1988-08-10,205,0.0048543689320388345
4,dana@outlook.com,1992-05-30,F,True,4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a,4,2026-01-25 16:45:00+00:00,3,60,401,1992-04-30,42,0.023255813953488372
3,charlie@tempmail.com,2000-11-21,M,False,4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce,3,2026-01-05 12:30:00+00:00,2,30,301,2000-10-21,62,0.015873015873015872
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-01-15 09:15:00+00:00,5,120,101,1990-01-17,52,0.018867924528301886
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-02-01 10:20:00+00:00,3,90,102,1990-01-17,35,0.027777777777777776
7,grace@no-reply.com,1993-03-15,F,True,7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451,7,2025-09-15 17:10:00+00:00,2,45,701,1993-02-15,174,0.005714285714285714
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2025-12-20 14:10:00+00:00,4,60,201,1985-06-05,78,0.012658227848101266
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2026-02-10 08:50:00+00:00,6,180,202,1985-06-05,26,0.037037037037037035
6,frank@gmail.com,1995-12-01,M,True,e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683,6,2026-01-30 09:00:00+00:00,4,120,601,1995-11-01,37,0.02631578947368421
5,eric@mailinator.com,1988-09-10,M,True,ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d,5,2025-08-01 11:00:00+00:00,5,150,501,1988-08-10,219,0.004545454545454545
1 user_id_x email birth_date gender consent_flag user_id_hashed user_id_y timestamp page_depth dwell_time session_id birthday_trigger recency_index rf_score
2 4 dana@outlook.com 1992-05-30 F True 4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a 4 2026-01-25 16:45:00+00:00 3 60 401 1992-04-30 28 42 0.034482758620689655 0.023255813953488372
3 3 charlie@tempmail.com 2000-11-21 M False 4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce 3 2026-01-05 12:30:00+00:00 2 30 301 2000-10-21 48 62 0.02040816326530612 0.015873015873015872
4 1 alice@gmail.com 1990-02-17 F True 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b 1 2026-01-15 09:15:00+00:00 5 120 101 1990-01-17 38 52 0.02564102564102564 0.018867924528301886
5 1 alice@gmail.com 1990-02-17 F True 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b 1 2026-02-01 10:20:00+00:00 3 90 102 1990-01-17 21 35 0.045454545454545456 0.027777777777777776
6 7 grace@no-reply.com 1993-03-15 F True 7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451 7 2025-09-15 17:10:00+00:00 2 45 701 1993-02-15 160 174 0.006211180124223602 0.005714285714285714
7 2 bob@yahoo.com 1985-07-05 M True d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35 2 2025-12-20 14:10:00+00:00 4 60 201 1985-06-05 64 78 0.015384615384615385 0.012658227848101266
8 2 bob@yahoo.com 1985-07-05 M True d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35 2 2026-02-10 08:50:00+00:00 6 180 202 1985-06-05 13 26 0.07142857142857142 0.037037037037037035
9 6 frank@gmail.com 1995-12-01 M True e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683 6 2026-01-30 09:00:00+00:00 4 120 601 1995-11-01 24 37 0.04 0.02631578947368421
10 5 eric@mailinator.com 1988-09-10 M True ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d 5 2025-08-01 11:00:00+00:00 5 150 501 1988-08-10 205 219 0.0048543689320388345 0.004545454545454545

View File

@@ -1,10 +1,10 @@
user_id_x,email,birth_date,gender,consent_flag,user_id_hashed,user_id_y,timestamp,page_depth,dwell_time,session_id,birthday_trigger,recency_index,rf_score
4,dana@outlook.com,1992-05-30,F,True,4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a,4,2026-01-25 16:45:00+00:00,3,60,401,1992-04-30,28,0.0344827586206896
3,charlie@tempmail.com,2000-11-21,M,False,4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce,3,2026-01-05 12:30:00+00:00,2,30,301,2000-10-21,48,0.0204081632653061
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-01-15 09:15:00+00:00,5,120,101,1990-01-17,38,0.0256410256410256
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-02-01 10:20:00+00:00,3,90,102,1990-01-17,21,0.0454545454545454
7,grace@no-reply.com,1993-03-15,F,True,7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451,7,2025-09-15 17:10:00+00:00,2,45,701,1993-02-15,160,0.0062111801242236
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2025-12-20 14:10:00+00:00,4,60,201,1985-06-05,64,0.0153846153846153
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2026-02-10 08:50:00+00:00,6,180,202,1985-06-05,13,0.0714285714285714
6,frank@gmail.com,1995-12-01,M,True,e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683,6,2026-01-30 09:00:00+00:00,4,120,601,1995-11-01,24,0.04
5,eric@mailinator.com,1988-09-10,M,True,ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d,5,2025-08-01 11:00:00+00:00,5,150,501,1988-08-10,205,0.0048543689320388
4,dana@outlook.com,1992-05-30,F,True,4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a,4,2026-01-25 16:45:00+00:00,3,60,401,1992-04-30,42,0.0232558139534883
3,charlie@tempmail.com,2000-11-21,M,False,4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce,3,2026-01-05 12:30:00+00:00,2,30,301,2000-10-21,62,0.0158730158730158
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-01-15 09:15:00+00:00,5,120,101,1990-01-17,52,0.0188679245283018
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-02-01 10:20:00+00:00,3,90,102,1990-01-17,35,0.0277777777777777
7,grace@no-reply.com,1993-03-15,F,True,7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451,7,2025-09-15 17:10:00+00:00,2,45,701,1993-02-15,174,0.0057142857142857
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2025-12-20 14:10:00+00:00,4,60,201,1985-06-05,78,0.0126582278481012
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2026-02-10 08:50:00+00:00,6,180,202,1985-06-05,26,0.037037037037037
6,frank@gmail.com,1995-12-01,M,True,e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683,6,2026-01-30 09:00:00+00:00,4,120,601,1995-11-01,37,0.0263157894736842
5,eric@mailinator.com,1988-09-10,M,True,ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d,5,2025-08-01 11:00:00+00:00,5,150,501,1988-08-10,219,0.0045454545454545
1 user_id_x email birth_date gender consent_flag user_id_hashed user_id_y timestamp page_depth dwell_time session_id birthday_trigger recency_index rf_score
2 4 dana@outlook.com 1992-05-30 F True 4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a 4 2026-01-25 16:45:00+00:00 3 60 401 1992-04-30 28 42 0.0344827586206896 0.0232558139534883
3 3 charlie@tempmail.com 2000-11-21 M False 4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce 3 2026-01-05 12:30:00+00:00 2 30 301 2000-10-21 48 62 0.0204081632653061 0.0158730158730158
4 1 alice@gmail.com 1990-02-17 F True 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b 1 2026-01-15 09:15:00+00:00 5 120 101 1990-01-17 38 52 0.0256410256410256 0.0188679245283018
5 1 alice@gmail.com 1990-02-17 F True 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b 1 2026-02-01 10:20:00+00:00 3 90 102 1990-01-17 21 35 0.0454545454545454 0.0277777777777777
6 7 grace@no-reply.com 1993-03-15 F True 7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451 7 2025-09-15 17:10:00+00:00 2 45 701 1993-02-15 160 174 0.0062111801242236 0.0057142857142857
7 2 bob@yahoo.com 1985-07-05 M True d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35 2 2025-12-20 14:10:00+00:00 4 60 201 1985-06-05 64 78 0.0153846153846153 0.0126582278481012
8 2 bob@yahoo.com 1985-07-05 M True d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35 2 2026-02-10 08:50:00+00:00 6 180 202 1985-06-05 13 26 0.0714285714285714 0.037037037037037
9 6 frank@gmail.com 1995-12-01 M True e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683 6 2026-01-30 09:00:00+00:00 4 120 601 1995-11-01 24 37 0.04 0.0263157894736842
10 5 eric@mailinator.com 1988-09-10 M True ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d 5 2025-08-01 11:00:00+00:00 5 150 501 1988-08-10 205 219 0.0048543689320388 0.0045454545454545

View File

@@ -1,10 +1,10 @@
user_id_x,email,birth_date,gender,consent_flag,user_id_hashed,user_id_y,timestamp,page_depth,dwell_time,session_id,birthday_trigger,recency_index,rf_score,deliverable
4,dana@outlook.com,1992-05-30,F,True,4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a,4,2026-01-25 16:45:00+00:00,3,60,401,1992-04-30,28,0.0344827586206896,True
3,charlie@tempmail.com,2000-11-21,M,False,4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce,3,2026-01-05 12:30:00+00:00,2,30,301,2000-10-21,48,0.0204081632653061,True
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-01-15 09:15:00+00:00,5,120,101,1990-01-17,38,0.0256410256410256,True
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-02-01 10:20:00+00:00,3,90,102,1990-01-17,21,0.0454545454545454,True
7,grace@no-reply.com,1993-03-15,F,True,7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451,7,2025-09-15 17:10:00+00:00,2,45,701,1993-02-15,160,0.0062111801242236,True
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2025-12-20 14:10:00+00:00,4,60,201,1985-06-05,64,0.0153846153846153,True
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2026-02-10 08:50:00+00:00,6,180,202,1985-06-05,13,0.0714285714285714,True
6,frank@gmail.com,1995-12-01,M,True,e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683,6,2026-01-30 09:00:00+00:00,4,120,601,1995-11-01,24,0.04,True
5,eric@mailinator.com,1988-09-10,M,True,ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d,5,2025-08-01 11:00:00+00:00,5,150,501,1988-08-10,205,0.0048543689320388,True
4,dana@outlook.com,1992-05-30,F,True,4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a,4,2026-01-25 16:45:00+00:00,3,60,401,1992-04-30,42,0.0232558139534883,True
3,charlie@tempmail.com,2000-11-21,M,False,4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce,3,2026-01-05 12:30:00+00:00,2,30,301,2000-10-21,62,0.0158730158730158,True
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-01-15 09:15:00+00:00,5,120,101,1990-01-17,52,0.0188679245283018,True
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-02-01 10:20:00+00:00,3,90,102,1990-01-17,35,0.0277777777777777,True
7,grace@no-reply.com,1993-03-15,F,True,7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451,7,2025-09-15 17:10:00+00:00,2,45,701,1993-02-15,174,0.0057142857142857,True
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2025-12-20 14:10:00+00:00,4,60,201,1985-06-05,78,0.0126582278481012,True
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2026-02-10 08:50:00+00:00,6,180,202,1985-06-05,26,0.037037037037037,True
6,frank@gmail.com,1995-12-01,M,True,e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683,6,2026-01-30 09:00:00+00:00,4,120,601,1995-11-01,37,0.0263157894736842,True
5,eric@mailinator.com,1988-09-10,M,True,ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d,5,2025-08-01 11:00:00+00:00,5,150,501,1988-08-10,219,0.0045454545454545,True
1 user_id_x email birth_date gender consent_flag user_id_hashed user_id_y timestamp page_depth dwell_time session_id birthday_trigger recency_index rf_score deliverable
2 4 dana@outlook.com 1992-05-30 F True 4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a 4 2026-01-25 16:45:00+00:00 3 60 401 1992-04-30 28 42 0.0344827586206896 0.0232558139534883 True
3 3 charlie@tempmail.com 2000-11-21 M False 4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce 3 2026-01-05 12:30:00+00:00 2 30 301 2000-10-21 48 62 0.0204081632653061 0.0158730158730158 True
4 1 alice@gmail.com 1990-02-17 F True 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b 1 2026-01-15 09:15:00+00:00 5 120 101 1990-01-17 38 52 0.0256410256410256 0.0188679245283018 True
5 1 alice@gmail.com 1990-02-17 F True 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b 1 2026-02-01 10:20:00+00:00 3 90 102 1990-01-17 21 35 0.0454545454545454 0.0277777777777777 True
6 7 grace@no-reply.com 1993-03-15 F True 7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451 7 2025-09-15 17:10:00+00:00 2 45 701 1993-02-15 160 174 0.0062111801242236 0.0057142857142857 True
7 2 bob@yahoo.com 1985-07-05 M True d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35 2 2025-12-20 14:10:00+00:00 4 60 201 1985-06-05 64 78 0.0153846153846153 0.0126582278481012 True
8 2 bob@yahoo.com 1985-07-05 M True d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35 2 2026-02-10 08:50:00+00:00 6 180 202 1985-06-05 13 26 0.0714285714285714 0.037037037037037 True
9 6 frank@gmail.com 1995-12-01 M True e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683 6 2026-01-30 09:00:00+00:00 4 120 601 1995-11-01 24 37 0.04 0.0263157894736842 True
10 5 eric@mailinator.com 1988-09-10 M True ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d 5 2025-08-01 11:00:00+00:00 5 150 501 1988-08-10 205 219 0.0048543689320388 0.0045454545454545 True

View File

@@ -1,10 +1,10 @@
user_id_x,email,birth_date,gender,consent_flag,user_id_hashed,user_id_y,timestamp,page_depth,dwell_time,session_id,birthday_trigger,recency_index,rf_score,sensitive_flag
4,dana@outlook.com,1992-05-30,F,True,4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a,4,2026-01-25 16:45:00+00:00,3,60,401,1992-04-30,28,0.0344827586206896,False
3,charlie@tempmail.com,2000-11-21,M,False,4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce,3,2026-01-05 12:30:00+00:00,2,30,301,2000-10-21,48,0.0204081632653061,False
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-01-15 09:15:00+00:00,5,120,101,1990-01-17,38,0.0256410256410256,False
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-02-01 10:20:00+00:00,3,90,102,1990-01-17,21,0.0454545454545454,False
7,grace@no-reply.com,1993-03-15,F,True,7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451,7,2025-09-15 17:10:00+00:00,2,45,701,1993-02-15,160,0.0062111801242236,False
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2025-12-20 14:10:00+00:00,4,60,201,1985-06-05,64,0.0153846153846153,False
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2026-02-10 08:50:00+00:00,6,180,202,1985-06-05,13,0.0714285714285714,False
6,frank@gmail.com,1995-12-01,M,True,e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683,6,2026-01-30 09:00:00+00:00,4,120,601,1995-11-01,24,0.04,False
5,eric@mailinator.com,1988-09-10,M,True,ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d,5,2025-08-01 11:00:00+00:00,5,150,501,1988-08-10,205,0.0048543689320388,False
4,dana@outlook.com,1992-05-30,F,True,4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a,4,2026-01-25 16:45:00+00:00,3,60,401,1992-04-30,42,0.0232558139534883,False
3,charlie@tempmail.com,2000-11-21,M,False,4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce,3,2026-01-05 12:30:00+00:00,2,30,301,2000-10-21,62,0.0158730158730158,False
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-01-15 09:15:00+00:00,5,120,101,1990-01-17,52,0.0188679245283018,False
1,alice@gmail.com,1990-02-17,F,True,6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b,1,2026-02-01 10:20:00+00:00,3,90,102,1990-01-17,35,0.0277777777777777,False
7,grace@no-reply.com,1993-03-15,F,True,7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451,7,2025-09-15 17:10:00+00:00,2,45,701,1993-02-15,174,0.0057142857142857,False
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2025-12-20 14:10:00+00:00,4,60,201,1985-06-05,78,0.0126582278481012,False
2,bob@yahoo.com,1985-07-05,M,True,d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35,2,2026-02-10 08:50:00+00:00,6,180,202,1985-06-05,26,0.037037037037037,False
6,frank@gmail.com,1995-12-01,M,True,e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683,6,2026-01-30 09:00:00+00:00,4,120,601,1995-11-01,37,0.0263157894736842,False
5,eric@mailinator.com,1988-09-10,M,True,ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d,5,2025-08-01 11:00:00+00:00,5,150,501,1988-08-10,219,0.0045454545454545,False
1 user_id_x email birth_date gender consent_flag user_id_hashed user_id_y timestamp page_depth dwell_time session_id birthday_trigger recency_index rf_score sensitive_flag
2 4 dana@outlook.com 1992-05-30 F True 4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a 4 2026-01-25 16:45:00+00:00 3 60 401 1992-04-30 28 42 0.0344827586206896 0.0232558139534883 False
3 3 charlie@tempmail.com 2000-11-21 M False 4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce 3 2026-01-05 12:30:00+00:00 2 30 301 2000-10-21 48 62 0.0204081632653061 0.0158730158730158 False
4 1 alice@gmail.com 1990-02-17 F True 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b 1 2026-01-15 09:15:00+00:00 5 120 101 1990-01-17 38 52 0.0256410256410256 0.0188679245283018 False
5 1 alice@gmail.com 1990-02-17 F True 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b 1 2026-02-01 10:20:00+00:00 3 90 102 1990-01-17 21 35 0.0454545454545454 0.0277777777777777 False
6 7 grace@no-reply.com 1993-03-15 F True 7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451 7 2025-09-15 17:10:00+00:00 2 45 701 1993-02-15 160 174 0.0062111801242236 0.0057142857142857 False
7 2 bob@yahoo.com 1985-07-05 M True d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35 2 2025-12-20 14:10:00+00:00 4 60 201 1985-06-05 64 78 0.0153846153846153 0.0126582278481012 False
8 2 bob@yahoo.com 1985-07-05 M True d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35 2 2026-02-10 08:50:00+00:00 6 180 202 1985-06-05 13 26 0.0714285714285714 0.037037037037037 False
9 6 frank@gmail.com 1995-12-01 M True e7f6c011776e8db7cd330b54174fd76f7d0216b612387a5ffcfb81e6f0919683 6 2026-01-30 09:00:00+00:00 4 120 601 1995-11-01 24 37 0.04 0.0263157894736842 False
10 5 eric@mailinator.com 1988-09-10 M True ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d 5 2025-08-01 11:00:00+00:00 5 150 501 1988-08-10 205 219 0.0048543689320388 0.0045454545454545 False

View File

@@ -0,0 +1,30 @@
import numpy as np
def classify_churn(df):
inactivity_threshold = 30
drop_threshold = 20
df["engagement_drop"] = df["engagement_score"].diff().fillna(0)
df["churn_risk_flag"] = (
(df["days_since_last_session"] >= inactivity_threshold) &
(df["engagement_drop"] <= -drop_threshold)
)
def risk_class(row):
if row["churn_risk_flag"]:
return "High Risk"
elif row["engagement_score"] < 50:
return "Medium Risk"
else:
return "Low Risk"
df["risk_class"] = df.apply(risk_class, axis=1)
df["churn_probability_score"] = np.where(
df["risk_class"] == "High Risk", 0.8,
np.where(df["risk_class"] == "Medium Risk", 0.5, 0.2)
)
return df

View File

@@ -0,0 +1,46 @@
import pandas as pd
import numpy as np
def calculate_engagement_scores(df):
now = pd.Timestamp.now(tz="UTC")
df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True, errors="coerce")
df["days_since_last_session"] = (now - df["timestamp"]).dt.days
df["session_count"] = df.groupby("user_id_hashed")["timestamp"].transform("count")
df["avg_session_duration"] = df.get("session_duration", 1)
df["click_engagement"] = df.get("click_through_rate", 0.5)
def normalize(series):
return (series - series.min()) / (series.max() - series.min() + 1e-9)
recency_norm = 1 - normalize(df["days_since_last_session"])
frequency_norm = normalize(df["session_count"])
duration_norm = normalize(df["avg_session_duration"])
click_norm = normalize(df["click_engagement"])
score = (
0.35 * recency_norm +
0.30 * frequency_norm +
0.20 * duration_norm +
0.15 * click_norm
)
df["engagement_score"] = (score * 100).round(2)
def classify(score):
if score >= 75:
return "Tier 1 - Highly Engaged"
elif score >= 50:
return "Tier 2 - Stable"
elif score >= 25:
return "Tier 3 - Declining"
else:
return "Tier 4 - At Risk"
df["engagement_tier"] = df["engagement_score"].apply(classify)
return df

View File

@@ -0,0 +1,21 @@
import pandas as pd
def build_predictive_segments(df):
# Default segment
df["predictive_segment"] = "General"
# At-risk + birthday approaching
df.loc[
(df.get("risk_class") == "High Risk") &
(df.get("birthday_within_30_days", False) == True),
"predictive_segment"
] = "At-Risk + Birthday Approaching"
# High-value users
df.loc[
(df.get("engagement_tier") == "Tier 1 - Highly Engaged"),
"predictive_segment"
] = "High Value - Amplify"
return df

View File

@@ -0,0 +1,21 @@
import pandas as pd
def optimize_send_time(df):
# Safety check
required_cols = ["timestamp", "click_engagement"]
for col in required_cols:
if col not in df.columns:
raise ValueError(f"{col} missing for send-time optimization")
df["hour"] = pd.to_datetime(df["timestamp"], utc=True).dt.hour
optimal_hours = (
df.groupby("hour")["click_engagement"]
.mean()
.idxmax()
)
df["optimal_send_hour"] = optimal_hours
return df