{"id":5897,"date":"2015-03-18T15:08:23","date_gmt":"2015-03-18T09:38:23","guid":{"rendered":"http:\/\/ivyproschool.com\/blog\/?p=5897"},"modified":"2025-04-01T16:51:13","modified_gmt":"2025-04-01T11:21:13","slug":"why-data-dredging-is-trending","status":"publish","type":"post","link":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/","title":{"rendered":"Why Data Dredging is trending"},"content":{"rendered":"<p style=\"text-align: center;\"><em>&#8220;If you torture the data long enough, it will confess to anything.&#8221; \u00a0\u00a0<\/em><em>&#8212; Ronald Coase, British economist<\/em><\/p>\n<p>If you thought data was only \u2018mined\u2019 and \u2018extracted\u2019 for analysis, take a look at the frequently used practice of \u2018data dredging\u2019.<\/p>\n<p>As we move from traditional eyeballing of statistical data to machine-based techniques, the entire process of data extraction becomes more technique-driven.<\/p>\n<p>One such practice is the analysis of large volumes of data in the quest for any possible relationship. An example would be \u201cfishing\u201d in very large datasets to analyse crime clusters without understanding causation, or \u201csnooping\u201d into an app user\u2019s habits to find correlations. That is, combing data for patterns without pre-established hypotheses or objectives. This sounds absurd, but it may actually throw up significant unseen relationships (what does the app user do at lunchtime when in the vicinity of Connaught Place, New Delhi?).<\/p>\n<p>With the evolution of Big Data, a fundamentally different practice of experimental design has evolved. Formerly, the questions a project asked would decide what data to collect and analyse. 
Now, the low cost of data storage has caused a rethink, with all kinds of data being collected first and then searched for significant patterns.<\/p>\n<p>This practice of \u201cdata dredging\u201d differs from traditional Data Mining practices.<\/p>\n<p><a href=\"https:\/\/ivyproschool.com\/blog\/wp-content\/uploads\/2015\/03\/Data-dredging.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5898\" src=\"https:\/\/ivyproschool.com\/blog\/wp-content\/uploads\/2015\/03\/Data-dredging.png\" alt=\"Data dredging\" width=\"526\" height=\"332\" srcset=\"https:\/\/ivyproschool.com\/blog\/wp-content\/uploads\/2015\/03\/Data-dredging.png 479w, https:\/\/ivyproschool.com\/blog\/wp-content\/uploads\/2015\/03\/Data-dredging-300x189.png 300w\" sizes=\"auto, (max-width: 526px) 100vw, 526px\" \/><\/a><strong>Data Dredging explained<\/strong><\/p>\n<p>Where the sample is not truly representative, where there is \u2018confounding\u2019 or \u2018selection bias\u2019, or where too many hypotheses are tested on a given dataset, some variables may appear highly correlated and statistically significant even though there is no real effect between them. At a significance level of 0.05 (5%), about one in twenty tests of a true null hypothesis comes out \u2018significant\u2019 by chance alone. This is a typical case of \u201cdata dredging\u201d: false positive findings that result from looking at too many possible associations. 
One way to guard against the errors of \u201cdata dredging\u201d is to be more stringent with \u201csignificance\u201d levels, moving to P&lt;0.001 or beyond.<\/p>\n<p><strong>Applications of Data Dredging<\/strong><\/p>\n<ul>\n<li>Forensic Analysis<\/li>\n<li>Market Basket Analysis<\/li>\n<li>Risk Analysis<\/li>\n<li>Fraud detection<\/li>\n<li>Medical Science<\/li>\n<li>Public Health<\/li>\n<li>Clinical Research<\/li>\n<li>Digital Analytics<\/li>\n<li>Social Media<\/li>\n<\/ul>\n<p><strong>When does Data Dredging occur?<\/strong><\/p>\n<ul>\n<li>When adjustments are not made for the statistical effects of searching large models<\/li>\n<li>When there is statistical bias, confounding or misinterpretation of the P&lt;0.05 significance test<\/li>\n<li>When there is suboptimal model construction<\/li>\n<li>When there is \u2018overfitting\u2019 of data<\/li>\n<li>When too many hypotheses are tested without proper statistical control<\/li>\n<li>When there is \u2018oversearching\u2019 of relationships between variables<\/li>\n<li>When a model\u2019s accuracy is overestimated<\/li>\n<li>When a Data Mining technique is explicitly used to prove a particular pre-established point of view!<\/li>\n<\/ul>\n<p>So the next time you read research findings such as \u201c<em>Teens who eat lots of chocolate tend to be slimmer<\/em>\u201d, take them with a pinch of salt. Better still, look at them as a possible consequence of \u201cdata dredging\u201d!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;If you torture the data long enough, it will confess to anything.&#8221; \u00a0\u00a0&#8212; Ronald Coase, British economist \u00a0If you thought DATA was only \u2018mined\u2019 and \u2018extracted\u2019 for analysis, take a look at this frequently used method of \u2018data dredging\u2019. As we move over from traditional eyeballing of statistical data to dig deeper into machine based techniques, the entire process of DATA extraction gets more technique based. 
One such DATA extraction practice is analysis of large volumes of data in the quest for ANY possible relationships. An example would be \u201cfishing\u201d in very large datasets to analyse crime clusters without understanding causation. Or say \u201csnooping\u201d into an App user\u2019s habits for finding correlations. \u00a0That is, combing data for patterns without pre-established hypotheses or objectives. Which sounds absurd, but may actually throw-up significant unseen relationships (what does the App user do at lunchtime when in the vicinity of Connaught Place, New Delhi?). With the evolution of Big Data a fundamentally different practice of experimental design has evolved. Formerly, the project \/ questions asked would decide what data to collect, for analysis of the same. Now, the low cost of data storage has caused a rethink with all kinds of data being collected [&hellip;]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[601,467,530,600],"class_list":["post-5897","post","type-post","status-publish","format-standard","hentry","category-data-analytics","tag-data-dredging","tag-data-science","tag-term","tag-terminology"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Why Data Dredging is Trending in Analytics.<\/title>\n<meta name=\"description\" content=\"Discover how data dredging, a method of finding patterns in large datasets without predefined hypotheses, is impacting analytics across industries.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" 
content=\"article\" \/>\n<meta property=\"og:title\" content=\"Why Data Dredging is Trending in Analytics.\" \/>\n<meta property=\"og:description\" content=\"Discover how data dredging, a method of finding patterns in large datasets without predefined hypotheses, is impacting analytics across industries.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/\" \/>\n<meta property=\"og:site_name\" content=\"R vs Python: Which Analytics Tool Should You Choose for Data Science?\" \/>\n<meta property=\"article:author\" content=\"https:\/\/facebook.com\/ivyproschool\" \/>\n<meta property=\"article:published_time\" content=\"2015-03-18T09:38:23+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-01T11:21:13+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/ivyproschool.com\/blog\/wp-content\/uploads\/2015\/03\/Data-dredging.png\" \/>\n<meta name=\"author\" content=\"Ivy Professional School\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@ivyproschool\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Ivy Professional School\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/\"},\"author\":{\"name\":\"Ivy Professional School\",\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/#\\\/schema\\\/person\\\/31fdab8559dd3db99173764bfb60215d\"},\"headline\":\"Why Data Dredging is trending\",\"datePublished\":\"2015-03-18T09:38:23+00:00\",\"dateModified\":\"2025-04-01T11:21:13+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/\"},\"wordCount\":458,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/wp-content\\\/uploads\\\/2015\\\/03\\\/Data-dredging.png\",\"keywords\":[\"data dredging\",\"data science\",\"term\",\"terminology\"],\"articleSection\":[\"Data Analytics\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/\",\"url\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/\",\"name\":\"Why Data Dredging is Trending in 
Analytics.\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/wp-content\\\/uploads\\\/2015\\\/03\\\/Data-dredging.png\",\"datePublished\":\"2015-03-18T09:38:23+00:00\",\"dateModified\":\"2025-04-01T11:21:13+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/#\\\/schema\\\/person\\\/31fdab8559dd3db99173764bfb60215d\"},\"description\":\"Discover how data dredging, a method of finding patterns in large datasets without predefined hypotheses, is impacting analytics across industries.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/#primaryimage\",\"url\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/wp-content\\\/uploads\\\/2015\\\/03\\\/Data-dredging.png\",\"contentUrl\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/wp-content\\\/uploads\\\/2015\\\/03\\\/Data-dredging.png\",\"width\":479,\"height\":302},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/why-data-dredging-is-trending\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Why Data Dredging is 
trending\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/\",\"name\":\"Ivy Professional School | Official Blog\",\"description\":\"Confused between R and Python for your data science journey? Discover the key differences in data visualization, handling capabilities, speed, and ease of learning.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/#\\\/schema\\\/person\\\/31fdab8559dd3db99173764bfb60215d\",\"name\":\"Ivy Professional School\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/866b09293f13d461b399bb9a40607e85623ede13d844f763bf665689cb0d1452?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/866b09293f13d461b399bb9a40607e85623ede13d844f763bf665689cb0d1452?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/866b09293f13d461b399bb9a40607e85623ede13d844f763bf665689cb0d1452?s=96&d=mm&r=g\",\"caption\":\"Ivy Professional School\"},\"sameAs\":[\"http:\\\/\\\/www.ivyproschool.com\",\"https:\\\/\\\/facebook.com\\\/ivyproschool\",\"https:\\\/\\\/instagram.com\\\/ivyproschool\",\"https:\\\/\\\/x.com\\\/ivyproschool\",\"https:\\\/\\\/youtube.com\\\/ivyproschool\"],\"url\":\"https:\\\/\\\/ivyproschool.com\\\/blog\\\/author\\\/prateek\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Why Data Dredging is Trending in Analytics.","description":"Discover how data dredging, a method of finding patterns in large datasets without predefined hypotheses, is impacting analytics across industries.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/","og_locale":"en_US","og_type":"article","og_title":"Why Data Dredging is Trending in Analytics.","og_description":"Discover how data dredging, a method of finding patterns in large datasets without predefined hypotheses, is impacting analytics across industries.","og_url":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/","og_site_name":"R vs Python: Which Analytics Tool Should You Choose for Data Science?","article_author":"https:\/\/facebook.com\/ivyproschool","article_published_time":"2015-03-18T09:38:23+00:00","article_modified_time":"2025-04-01T11:21:13+00:00","og_image":[{"url":"https:\/\/ivyproschool.com\/blog\/wp-content\/uploads\/2015\/03\/Data-dredging.png","type":"","width":"","height":""}],"author":"Ivy Professional School","twitter_card":"summary_large_image","twitter_creator":"@ivyproschool","twitter_misc":{"Written by":"Ivy Professional School","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/#article","isPartOf":{"@id":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/"},"author":{"name":"Ivy Professional School","@id":"https:\/\/ivyproschool.com\/blog\/#\/schema\/person\/31fdab8559dd3db99173764bfb60215d"},"headline":"Why Data Dredging is trending","datePublished":"2015-03-18T09:38:23+00:00","dateModified":"2025-04-01T11:21:13+00:00","mainEntityOfPage":{"@id":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/"},"wordCount":458,"commentCount":0,"image":{"@id":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/#primaryimage"},"thumbnailUrl":"https:\/\/ivyproschool.com\/blog\/wp-content\/uploads\/2015\/03\/Data-dredging.png","keywords":["data dredging","data science","term","terminology"],"articleSection":["Data Analytics"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/","url":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/","name":"Why Data Dredging is Trending in Analytics.","isPartOf":{"@id":"https:\/\/ivyproschool.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/#primaryimage"},"image":{"@id":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/#primaryimage"},"thumbnailUrl":"https:\/\/ivyproschool.com\/blog\/wp-content\/uploads\/2015\/03\/Data-dredging.png","datePublished":"2015-03-18T09:38:23+00:00","dateModified":"2025-04-01T11:21:13+00:00","author":{"@id":"https:\/\/ivyproschool.com\/blog\/#\/schema\/person\/31fdab8559dd3db99173764bfb60215d"},"description":"Discover how data dredging, a method of finding patterns in large 
datasets without predefined hypotheses, is impacting analytics across industries.","breadcrumb":{"@id":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/#primaryimage","url":"https:\/\/ivyproschool.com\/blog\/wp-content\/uploads\/2015\/03\/Data-dredging.png","contentUrl":"https:\/\/ivyproschool.com\/blog\/wp-content\/uploads\/2015\/03\/Data-dredging.png","width":479,"height":302},{"@type":"BreadcrumbList","@id":"https:\/\/ivyproschool.com\/blog\/why-data-dredging-is-trending\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ivyproschool.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Why Data Dredging is trending"}]},{"@type":"WebSite","@id":"https:\/\/ivyproschool.com\/blog\/#website","url":"https:\/\/ivyproschool.com\/blog\/","name":"Ivy Professional School | Official Blog","description":"Confused between R and Python for your data science journey? 
Discover the key differences in data visualization, handling capabilities, speed, and ease of learning.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ivyproschool.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/ivyproschool.com\/blog\/#\/schema\/person\/31fdab8559dd3db99173764bfb60215d","name":"Ivy Professional School","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/866b09293f13d461b399bb9a40607e85623ede13d844f763bf665689cb0d1452?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/866b09293f13d461b399bb9a40607e85623ede13d844f763bf665689cb0d1452?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/866b09293f13d461b399bb9a40607e85623ede13d844f763bf665689cb0d1452?s=96&d=mm&r=g","caption":"Ivy Professional School"},"sameAs":["http:\/\/www.ivyproschool.com","https:\/\/facebook.com\/ivyproschool","https:\/\/instagram.com\/ivyproschool","https:\/\/x.com\/ivyproschool","https:\/\/youtube.com\/ivyproschool"],"url":"https:\/\/ivyproschool.com\/blog\/author\/prateek\/"}]}},"_links":{"self":[{"href":"https:\/\/ivyproschool.com\/blog\/wp-json\/wp\/v2\/posts\/5897","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ivyproschool.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ivyproschool.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ivyproschool.com\/blog\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/ivyproschool.com\/blog\/wp-json\/wp\/v2\/comments?post=5897"}],"version-history":[{"count":1,"href":"https:\/\/ivyproschool.com\/blog\/wp-json\/wp\/v2\/posts\/5897\/revisions"}],"predecessor-version":[{"id":12805,"href":"https:\/\/ivyproschool.com\/blog\/wp-json\/wp\/v2\/posts\/5897\/revisions\/1280
5"}],"wp:attachment":[{"href":"https:\/\/ivyproschool.com\/blog\/wp-json\/wp\/v2\/media?parent=5897"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ivyproschool.com\/blog\/wp-json\/wp\/v2\/categories?post=5897"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ivyproschool.com\/blog\/wp-json\/wp\/v2\/tags?post=5897"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}