Predicting User Churn for a Digital Health App

Nadaa Taiyab

Diabesties was an iPhone app built to help college students with Type 1 Diabetes better manage their condition through tracking and social support. This project uses machine learning algorithms to predict user churn after a user's first week of engagement.

View the GitHub repo to see the full analysis and code: www.github.com/nadaataiyab/diabesties

A YouTube video of the live presentation is available here: https://www.youtube.com/watch?v=6jJtakvCEqA&t=1s

Predicting User Churn for a Digital Health App
Galvanize Data Science Immersive Capstone Project
Presentation to Ayogo, Oct 2017
Nadaa Taiyab

BACKGROUND
• Investment Banking
• International Development
• Health Tech Startups (Finance & Operations)
• Health Coaching

DIABESTIES APP
• iPhone app built by Ayogo in partnership with the College Diabetes Network

• Track glucose, insulin, carbs, and mood
• Share data with your "diabestie"
• Export data

PROJECT: PREDICT USER CHURN
Use machine learning algorithms to predict which users would stop using the app

WHY CHURN MODELING?
• Increase persistence by intervening with users at risk of churn (e.g. phone call, email, reminder)
• Reduce cost and improve effectiveness of interventions
• Instructive for future app design and predictive modeling

TECHNOLOGIES
• seaborn

DATA
• MySQL Database
• 3 years of data (2012-2015)

• 3,080 users
• 51,125 logs
• 387,188 clicks
• Two kinds of data: user data (demographics) and behavioral data

User Data
What can we learn about the users?

30% were college age (18-25), but the average age was 37.

51% had Type 1 Diabetes and 42% had Type 2 Diabetes.

58% of participants were female.

50% would not specify ethnicity, so it is hard to draw firm conclusions, but it is likely that the majority were Caucasian and living in the US.

DEMOGRAPHICS
• Many of the actual participants differed from the intended target of college students with Type 1 Diabetes
• Huge age range, including a lot of people in their late thirties
• Ethnicity and location inconclusive, but likely majority Caucasian in the US

Behavioral Data
What can we learn about how users interacted with the app?

Used mainly as a glucose tracker. Insulin and mood tracked 30% of the time.
• 97% of entries included a glucose log

Many users tried the app a few times and quit. A few users logged hundreds of times over 2-3 years.
• 85% entered data 0-10 times
• 15% entered more than 10 times

23% of users were active more than 30 days, though only 15% logged more than 10 times, indicating a stop-start usage pattern.

DIABESTIE EFFECT
• Only 3% of users had a diabestie
• Of users with a diabestie, 44% used the app for more than a week
• Having a diabestie didn't seem to affect churn

BEHAVIORAL DATA
• Mainly used as a glucose tracker
• Small core of 10-15% of users that are persistent
• Diabestie had no effect

Churn Model
Can we use machine learning algorithms to predict churn?

FEATURE ENGINEERING
• Behavioral Data
  • # page views
  • # total entries
  • # moods
  • # notes
  • # A1C updates
• User Data
  • age, gender, ethnicity, education
  • diabetes type
  • diabestie: yes / no

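A minimal sketch of how the behavioral counts listed above might be built, one row per user. The record layout (`user_id`, `kind`), the event names, and the function name are hypothetical; the deck does not show the actual MySQL schema or the author's code.

```python
from collections import Counter, defaultdict

# Hypothetical event records; the real schema is not shown in the deck.
events = [
    {"user_id": 1, "kind": "page_view"},
    {"user_id": 1, "kind": "entry"},
    {"user_id": 1, "kind": "mood"},
    {"user_id": 2, "kind": "page_view"},
    {"user_id": 2, "kind": "page_view"},
    {"user_id": 2, "kind": "a1c_update"},
]

# One count feature per behavioral item named on the slide.
FEATURE_KINDS = ["page_view", "entry", "mood", "note", "a1c_update"]


def behavioral_features(events):
    """Count each kind of event per user and return one feature row per user."""
    counts = defaultdict(Counter)
    for event in events:
        counts[event["user_id"]][event["kind"]] += 1
    # A Counter returns 0 for missing kinds, so every row has the same columns.
    return {
        user_id: {f"n_{kind}": per_user[kind] for kind in FEATURE_KINDS}
        for user_id, per_user in counts.items()
    }


features = behavioral_features(events)
print(features[1])
```

Demographic fields such as age or diabetes type would be joined onto these rows by user id before training.
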
CHURN DEFINITION
Users who completed fewer than ten additional entries after the first week of use

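The definition above can be sketched as a labeling function. This is an illustration only: it assumes the first week starts at a user's earliest entry (the deck does not say whether it starts at sign-up or at the first entry), and the names `is_churned`, `FIRST_WEEK`, and `MIN_ADDITIONAL_ENTRIES` are hypothetical.

```python
from datetime import datetime, timedelta

FIRST_WEEK = timedelta(days=7)
MIN_ADDITIONAL_ENTRIES = 10  # threshold taken from the churn definition


def is_churned(entry_times):
    """Return True if fewer than ten entries follow the user's first week.

    Assumption: the first week begins at the user's earliest entry.
    """
    if not entry_times:
        return True  # no entries at all, so none after week one
    first_week_end = min(entry_times) + FIRST_WEEK
    later = [t for t in entry_times if t >= first_week_end]
    return len(later) < MIN_ADDITIONAL_ENTRIES


start = datetime(2013, 1, 1)
tried_and_quit = [start + timedelta(days=d) for d in range(3)]   # 3 entries, all in week one
persistent = [start + timedelta(days=d) for d in range(30)]      # one entry a day for 30 days

print(is_churned(tried_and_quit))  # True
print(is_churned(persistent))      # False
```
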
CLASSIFIER MODELS
• Logistic Regression
• Random Forest
• Gradient Boosted Trees
• AdaBoost

Gradient Boosted Trees had the highest Area Under the Curve (AUC), indicating that it had the most predictive power.
• True Positive Rate (Sensitivity) = True Positives / (True Positives + False Negatives)
• False Positive Rate = 1 - (True Negatives / (True Negatives + False Positives))

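To make the link between these two rates and AUC concrete, here is a small sketch that sweeps the decision threshold, records a (false positive rate, true positive rate) point at each step, and integrates the resulting ROC curve with the trapezoidal rule. The labels and scores are made-up toy values, not data from the project, and the function names are illustrative.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Sort by descending score so each step admits the next-highest prediction.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item sharing this score (handles ties).
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy example: 1 = churn, 0 = not churn; scores are predicted churn probabilities.
y_true = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
print(round(auc(roc_points(y_true, scores)), 3))  # 0.833
```

An AUC of 1.0 means every churner is scored above every non-churner; 0.5 is what random scoring would give.
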
MODEL SCORES
            Logistic Regression   Random Forest   Gradient Boost   AdaBoost
Accuracy    92%                   92%             94%              93%
Precision   93%                   95%             95%              95%
Recall      98%                   96%             98%              97%
• AUC = Area Under the Curve
• Accuracy = (True Positives + True Negatives) / Total Sample Size
• Recall or Sensitivity = True Positives / (True Positives + False Negatives)
• Precision = True Positives / (True Positives + False Positives)

CONFUSION MATRIX (n = 770)
                   Predicted Churn       Predicted Not Churn   Total
Actual Churn       True Positive: 692    False Negative: 16    708
Actual Not Churn   False Positive: 34    True Negative: 28     62
Total              726                   44                    770
Results from Gradient Boosted Trees

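As a worked check, the scores reported for Gradient Boosted Trees can be recomputed from the four cells of the confusion matrix using the formulas given on the slides. The variable names are illustrative only.

```python
# Cell counts from the Gradient Boosted Trees confusion matrix (n = 770).
tp, fn = 692, 16   # actual churn: predicted churn / predicted not churn
fp, tn = 34, 28    # actual not churn: predicted churn / predicted not churn

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)                   # also the true positive rate
false_positive_rate = 1 - tn / (tn + fp)

print(f"n         = {total}")                     # 770
print(f"accuracy  = {accuracy:.1%}")              # 93.5%
print(f"precision = {precision:.1%}")             # 95.3%
print(f"recall    = {recall:.1%}")                # 97.7%
print(f"FPR       = {false_positive_rate:.1%}")   # 54.8%
```

These values round to the 94% accuracy, 95% precision, and 98% recall shown in the model scores table. The false positive rate is not reported on the slides; it is derived here from the same four cells.
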
Behavioral features had greater predictive power than demographics, except for age.
Feature importance scores calculated using the Random Forest algorithm.

TAKEAWAYS
• How an app is used can evolve in unexpected ways
• Behavioral data has more predictive power than demographics
• Improve data quality
• Structured data for location (maybe for A1C and glucose too)
• Safeguards against data entry errors
• Timestamp data is important
• Better app usage data
• Possibly improve the model by only including users that made at least one entry

Nadaa Taiyab
nadaa.taiyab@gmail.com
github.com/nadaataiyab/diabesties
