{"id":992,"date":"2020-09-04T16:04:31","date_gmt":"2020-09-04T16:04:31","guid":{"rendered":"https:\/\/data-science.gotoauthority.com\/2020\/09\/04\/data-scientists-think-data-is-their-1-problem-heres-why-theyre-wrong\/"},"modified":"2020-09-04T16:04:31","modified_gmt":"2020-09-04T16:04:31","slug":"data-scientists-think-data-is-their-1-problem-heres-why-theyre-wrong","status":"publish","type":"post","link":"https:\/\/wealthrevelation.com\/data-science\/2020\/09\/04\/data-scientists-think-data-is-their-1-problem-heres-why-theyre-wrong\/","title":{"rendered":"Data Scientists think data is their #1 problem. Here\u2019s why they\u2019re wrong."},"content":{"rendered":"<div id=\"post-\">\n<p><b>By <a href=\"https:\/\/www.linkedin.com\/in\/jamestaylor\/\" target=\"_blank\" rel=\"noopener noreferrer\">James Taylor<\/a>, CEO and leading authority on Digital Decisioning and delivering business impact from AI and machine learning<\/b>.<\/p>\n<p><img class=\"aligncenter size-large\" src=\"https:\/\/media-exp1.licdn.com\/dms\/image\/C5612AQFeB9xH20D9ig\/article-cover_image-shrink_720_1280\/0?e=1604534400&amp;v=beta&amp;t=8Vz95r15nARoQbsTagOCnJDa0UWti8ONCmI-b8x9b1o\" width=\"90%\"><\/p>\n<p>I often see articles or posts that identify data integration or preparation as the key issues facing data science projects. This always puzzles me as this is not our lived experience &#8211; not what we see when we work with Fortune 500 companies adopting predictive analytics, machine learning, or AI. But I think I have figured it out. The problem is as follows:<\/p>\n<blockquote>\n<p><em>What data scientists think counts as a &#8220;data science project&#8221; is not, in fact, a data science project.<\/em><\/p>\n<\/blockquote>\n<p>Let me illustrate this with some data from a great study. 
Back in 2016, the Economist Intelligence Unit ran a survey, &#8220;<a href=\"https:\/\/eiuperspectives.economist.com\/marketing\/broken-links-why-analytics-investments-have-yet-pay\" target=\"_blank\" rel=\"noopener noreferrer\">Broken links: Why analytics investments have yet to pay off<\/a>&#8221;, and below you can see how its data appears to support the argument that data problems are #1.<\/p>\n<p><img class=\"aligncenter size-large\" src=\"https:\/\/media-exp1.licdn.com\/dms\/image\/C5612AQGvuc-5zD7TqQ\/article-inline_image-shrink_1000_1488\/0?e=1604534400&amp;v=beta&amp;t=JgIxGzB6uEcHxe0deeCFO0PyOWztazk1O6Qoe53k-rM\" width=\"90%\"><\/p>\n<p>Wow &#8211; it looks pretty clear that data integration\/preparation is the biggest problem, with nearly twice as many projects reporting it as the next one.<\/p>\n<p>In fact, though, this is a subset of the data from the survey. Here&#8217;s the full data set:<\/p>\n<p><img class=\"aligncenter size-large\" src=\"https:\/\/media-exp1.licdn.com\/dms\/image\/C5612AQHtzxuflwowPA\/article-inline_image-shrink_1000_1488\/0?e=1604534400&amp;v=beta&amp;t=ugcaCwK1yTRNNGIGbR729Z1C7vuhozGYalC2W-M7pFg\" width=\"90%\"><\/p>\n<p>Data integration and preparation ranks only\u00a0<strong>#4<\/strong>. Problem definition\/framing, solution approach\/design, and action\/change management all rank higher. This matches our experience.<\/p>\n<p>In large, established &#8220;grown-up&#8221; companies, data science projects fail for one or both of two reasons:<\/p>\n<ul>\n<li>They are solving the wrong problem. They are building an analytic that is not what the business needs, that will not solve a true business problem, or that is poorly designed to fit into the business context.<\/li>\n<li>They cannot act on the model they build. 
They can&#8217;t change how the business makes decisions, so the decisions made and actions taken never take advantage of the analytic.<\/li>\n<\/ul>\n<p>And this illustrates the problem.<\/p>\n<p>The problem is that data scientists THINK their project starts with data and ends with the communication of their analysis. If that&#8217;s your focus, then data is your #1 problem.<\/p>\n<p>But this is not where data science projects start, nor where they end. They have to start and end with the\u00a0<strong>business<\/strong>. That means starting with a\u00a0<strong>business\u00a0<\/strong>problem &#8211; a business decision that the business wants to improve &#8211; and ending with that problem being solved &#8211; the\u00a0<strong>business\u00a0<\/strong>behaves differently (better). If that&#8217;s your focus, then your problem is not data but problem definition and operationalization &#8211; making the analytic work IRL.<\/p>\n<p>Here&#8217;s the difference, shown as project phases. On the left, what many data scientists think their projects involve, and on the right, what they really involve.<\/p>\n<p><img class=\"aligncenter size-large\" src=\"https:\/\/media-exp1.licdn.com\/dms\/image\/C5612AQE64LfBl4kwlw\/article-inline_image-shrink_1000_1488\/0?e=1604534400&amp;v=beta&amp;t=X2tCRPRfx7upF8crcDt9rGZ20OOOMA7euUUhgGQM408\" width=\"90%\"><\/p>\n<blockquote>\n<p><em>Bottom line: If your data science team is telling you that data is their #1 problem, then they&#8217;re doing it wrong.<\/em><\/p>\n<\/blockquote>\n<p>I&#8217;ve written about this before &#8211; check out this\u00a0<a href=\"https:\/\/www.linkedin.com\/pulse\/fixing-broken-links-analytics-value-chain-james-taylor\/\" target=\"_blank\" rel=\"noopener noreferrer\">article on the study itself<\/a>\u00a0and this one on\u00a0<a href=\"https:\/\/www.linkedin.com\/pulse\/adopt-decision-modeling-decisionsfirst-analytic-success-james-taylor\" target=\"_blank\" rel=\"noopener noreferrer\">adopting decision 
modeling<\/a>\u00a0as a better way to define the problems your data science team is trying to solve. You might also like our recent white paper and videos on\u00a0<a href=\"https:\/\/www.decisionmanagementsolutions.com\/analytic-enterprise\/\" target=\"_blank\" rel=\"noopener noreferrer\">Building an Analytic Enterprise<\/a>.<\/p>\n<p><a href=\"https:\/\/www.linkedin.com\/pulse\/data-scientists-think-1-problem-heres-why-theyre-wrong-james-taylor\/\">Original<\/a>. Reposted with permission.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/www.kdnuggets.com\/2020\/09\/data-scientist-data-problem-wrong.html<\/p>\n","protected":false},"author":0,"featured_media":993,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[],"_links":{"self":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts\/992"}],"collection":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/comments?post=992"}],"version-history":[{"count":0,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts\/992\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/media\/993"}],"wp:attachment":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/media?parent=992"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/categories?post=992"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/tags?post=992"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}