
"Begin With The End in Mind" - Stephen Covey
Welcome to Data Mining CRM! This blog documents lessons learned applying various data science and machine learning techniques to Customer Relationship Management (CRM) data.
Salesforce.com CRM and Weka are my primary tools, both of which have free Developer tools available. Click on the "Resources" page for links to the tools discussed in this blog.
Audience
My interests span both business and technology, so this blog has three audiences:
Financial Decision Makers
CFOs, Finance Executives, or Board Members who have a fiscal or fiduciary responsibility to an organization. Blog posts categorized as "Financial" will explore the ROI of data mining and how to set up data mining initiatives for success.
Business Decision Makers
Line-of-business leaders: VPs of Sales, Analysts, and other business users. Blog entries tagged "Business" will "begin with the end in mind": first identify the business objectives to be achieved, then work backwards to apply data mining techniques.
Technical Decision Makers
Developers, Analysts, Architects, Data Scientists, Statisticians: anyone who gets hands-on with implementing data mining and machine learning technology. Blog entries tagged "Technical" will explore the full lifecycle of data mining, from building training data sets, through classifying and making predictions, to making data mining a repeatable operational process.
Personal Journey to Data Mining
My personal journey to data mining began with attempts at applying Edward Tufte's information architecture print techniques to web dashboard designs and working backwards to understand how data must be structured to support rich analytics visualizations. This evolved into developing tools for analyzing site uptime logs, dabbling in predicting system behaviors, and developing a log analytics service (Logalytics.io).
I spent several years learning how to prepare and filter data so that it can be analyzed (The New York Times says about 50% to 80% of data mining is "janitor work"... and yes, that's true).
Tufte's multi-variate visualizations help humans identify patterns and correlations that are not evident by looking at the raw data. Can computers be trained to identify these patterns? If so, what is the future impact on CRM dashboard design and information architecture?
Identifying potential customer opportunities involves creating reports and dashboards that apply some commonly understood correlations: "Show me all customers who have spent in excess of X dollars over the past Y months" or "Show me all customers who have opened a newsletter email or clicked on a particular link for a particular campaign".
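The first of those report questions is really a hand-written rule over purchase history. Here is a minimal sketch in Java (the language Weka is built on). The `Purchase` class, its fields, and the sample data are all hypothetical and do not reflect any Salesforce or Weka schema or API:

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: "customers who spent more than X dollars over the past Y months".
class SpendReport {

    // Illustrative purchase row; the field names are assumptions, not a CRM schema.
    static final class Purchase {
        final String customerId;
        final double amount;
        final LocalDate date;

        Purchase(String customerId, double amount, LocalDate date) {
            this.customerId = customerId;
            this.amount = amount;
            this.date = date;
        }
    }

    // Returns the IDs of customers whose purchases dated on or after
    // (today minus `months`) add up to more than `minSpend`.
    static List<String> highSpenders(List<Purchase> purchases, double minSpend,
                                     int months, LocalDate today) {
        LocalDate cutoff = today.minusMonths(months);
        Map<String, Double> totals = new LinkedHashMap<>();
        for (Purchase p : purchases) {
            if (!p.date.isBefore(cutoff)) {
                totals.merge(p.customerId, p.amount, Double::sum);
            }
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Double> e : totals.entrySet()) {
            if (e.getValue() > minSpend) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2015, 6, 1);
        List<Purchase> purchases = List.of(
            new Purchase("acme", 600.0, LocalDate.of(2015, 5, 10)),
            new Purchase("acme", 500.0, LocalDate.of(2015, 3, 2)),
            new Purchase("globex", 2000.0, LocalDate.of(2014, 1, 15)),
            new Purchase("initech", 300.0, LocalDate.of(2015, 4, 20)));
        // X = 1000 dollars, Y = 6 months
        System.out.println(highSpenders(purchases, 1000.0, 6, today)); // prints [acme]
    }
}
```

Note that a person picks X and Y up front, so a rule like this can only confirm a correlation someone already suspected.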
But what correlations are we missing? There's just too much data today for the classic analytics model to scale. Big data gets bigger every day. Can we just dump all available customer data into a magic machine and have it reveal undiscovered correlations?
In pursuit of answers to these questions, I took Stanford's online machine learning course, which provides deep exposure to the statistical foundations of machine learning and artificial intelligence. However, my end goal of developing interactive, CRM-oriented dashboards required a more practical approach to data mining, which I ultimately discovered through the University of Waikato's online Weka courses. Weka's use of Java, coupled with some marketing-related learning recipes, provides a pragmatic approach to data mining CRM.
Next Steps
- Amazon.com "People who bought X also bought Y"
- Netflix.com "Recommended movies for you"
- Google search results
- YouTube recommended videos
- Facebook activity feed and targeted ads
Machine learning (ML) recommendation engines were built into the foundation of the above-mentioned brands, which gave them staggering competitive advantages. The travel and financial industries are experiencing churn as ML-focused services make exceptionally relevant predictions about customer demand and disrupt previously established business models.
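To give a feel for the simplest form of "people who bought X also bought Y", here is a toy co-purchase counter in Java. It is only a sketch under stated assumptions: it is not how any of the brands above build their engines, and the `AlsoBought` class and the basket data are invented for illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: rank items by how often they share a basket with a given item.
class AlsoBought {

    // Counts, for every other item, how many baskets contain it together with `item`,
    // then returns those items by descending count (ties broken alphabetically).
    static List<String> alsoBought(List<Set<String>> baskets, String item) {
        Map<String, Integer> counts = new HashMap<>();
        for (Set<String> basket : baskets) {
            if (!basket.contains(item)) {
                continue; // this basket says nothing about `item`
            }
            for (String other : basket) {
                if (!other.equals(item)) {
                    counts.merge(other, 1, Integer::sum);
                }
            }
        }
        List<String> ranked = new ArrayList<>(counts.keySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return ranked;
    }

    public static void main(String[] args) {
        // Made-up order history: each set is one customer's basket.
        List<Set<String>> baskets = List.of(
            Set.of("tent", "stove", "lantern"),
            Set.of("tent", "stove"),
            Set.of("tent", "boots"),
            Set.of("boots", "socks"));
        System.out.println(alsoBought(baskets, "tent")); // prints [stove, boots, lantern]
    }
}
```

Unlike a fixed report filter, nobody told this code that tents and stoves go together; the pairing comes out of the data, which is the shift in approach this blog sets out to explore.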
We live in an extremely dynamic society where a 360° view of the Customer involves data from CRM, ERP, social, mobile, Internet of Things sensor streams, and a variety of other systems of engagement. Data mining is our only hope to make sense of it all and evolve the craft of customer relationship management. I hope you'll actively comment on these blog entries and share in this journey.
(P.S. Converting this blog into a book is an eventual goal, so I will occasionally revisit some posts to edit for brevity or to add diagrams. Apologies in advance if this iterative approach to blogging results in some comments or inbound references appearing slightly out of context. I'll do my best to mention article changes within the comments.)