{"id":8042,"date":"2020-12-29T00:26:48","date_gmt":"2020-12-29T00:26:48","guid":{"rendered":"https:\/\/healinglifespan.com\/data-science\/2020\/12\/29\/monte-carlo-integration-in-python\/"},"modified":"2020-12-29T00:26:48","modified_gmt":"2020-12-29T00:26:48","slug":"monte-carlo-integration-in-python","status":"publish","type":"post","link":"https:\/\/wealthrevelation.com\/data-science\/2020\/12\/29\/monte-carlo-integration-in-python\/","title":{"rendered":"Monte Carlo integration in Python"},"content":{"rendered":"<div id=\"post-\">\n<div><img src=\"https:\/\/miro.medium.com\/max\/715\/1*LXdL8wBSxUsO1Cm9gUxOeA.png\" alt=\"Figure\" width=\"100%\"><br \/><span><\/p>\n<p>Image source: Wikipedia(Free) and collage made by the author<\/p>\n<p><\/span><\/div>\n<p>\u00a0<\/p>\n<p><strong>Disclaimer<\/strong>: The inspiration for this article stemmed from\u00a0<a href=\"https:\/\/www.gatech.edu\/academics\/degrees\/masters\/analytics-online-degree-oms-analytics\" rel=\"noopener noreferrer\" target=\"_blank\"><strong>Georgia Tech\u2019s Online Masters in Analytics (OMSA)<\/strong><\/a><strong>\u00a0<\/strong>program study material. I am proud to pursue this excellent Online MS program. You can also\u00a0<a href=\"http:\/\/catalog.gatech.edu\/programs\/analytics-ms\/#onlinetext\" rel=\"noopener noreferrer\" target=\"_blank\">check the details here.<\/a><\/p>\n<p>\u00a0<\/p>\n<h3>What is Monte Carlo integration?<\/h3>\n<p>\u00a0<br \/>Monte Carlo, is in fact, the name of the world-famous casino located in the eponymous district of the city-state (also called a Principality) of Monaco, on the world-famous French Riviera.<\/p>\n<p>It turns out that the casino inspired the minds of famous scientists to devise an intriguing mathematical technique for solving complex problems in statistics, numerical computing, system simulation.<\/p>\n<div><img src=\"https:\/\/miro.medium.com\/max\/1440\/1*2CHmFn9Exx4CbxITikcyjg.jpeg\" alt=\"Figure\" width=\"100%\"><br \/><span><\/p>\n<p><\/span><\/div>\n<p>\u00a0<\/p>\n<p>One of\u00a0the\u00a0<a href=\"https:\/\/www.solver.com\/press\/monte-carlo-methods-led-atomic-bomb-may-be-your-best-bet-business-decision-making\" rel=\"noopener noreferrer\" target=\"_blank\">first and most famous uses of this technique was during the Manhattan Project<\/a>\u00a0when the chain-reaction dynamics in highly enriched uranium presented an unimaginably complex theoretical calculation to the scientists. Even the genius minds like John Von Neumann, Stanislaw Ulam, Nicholas Metropolis could not tackle it in the traditional way. They, therefore, turned to the wonderful world of random numbers and let these probabilistic quantities tame the originally intractable calculations.<\/p>\n<div><img src=\"https:\/\/miro.medium.com\/max\/425\/1*OB_Eq67wTLKTct52mfPh_w.jpeg\" alt=\"Figure\" width=\"100%\"><br \/><span><\/p>\n<p><\/span><\/div>\n<p>\u00a0<\/p>\n<p>Amazingly, these random variables could solve the computing problem, which stymied the sure-footed deterministic approach. The elements of uncertainty actually won.<\/p>\n<p>Just like\u00a0<strong>uncertainty and randomness rule in the world of Monte Carlo games<\/strong>. That was the inspiration for this particular moniker.<\/p>\n<div><img src=\"https:\/\/miro.medium.com\/max\/480\/1*pfKrXDVZvabFK3L4ccgyrA.jpeg\" alt=\"Figure\" width=\"100%\"><br \/><span><\/p>\n<p><\/span><\/div>\n<p>\u00a0<\/p>\n<p>Today, it is a technique used in a wide swath of fields \u2014<\/p>\n<ul>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Monte_Carlo_methods_in_finance\" rel=\"noopener noreferrer\" target=\"_blank\">risk analysis, financial engineering<\/a>,\n<\/li>\n<li>supply chain logistics,\n<\/li>\n<li><a href=\"https:\/\/www.beckershospitalreview.com\/healthcare-information-technology\/a-million-trials-in-5-minutes-how-monte-carlo-simulations-could-revolutionize-healthcare.html\" rel=\"noopener noreferrer\" target=\"_blank\">healthcare research<\/a>, drug development\n<\/li>\n<li>statistical learning and modeling,\n<\/li>\n<li><a href=\"https:\/\/link.springer.com\/chapter\/10.1007\/978-3-540-74496-2_8\" rel=\"noopener noreferrer\" target=\"_blank\">computer graphics<\/a>, image processing,\u00a0<a href=\"https:\/\/beej.us\/blog\/data\/monte-carlo-method-game-ai\/\" rel=\"noopener noreferrer\" target=\"_blank\">game design<\/a>,\n<\/li>\n<li>large system simulations,\n<\/li>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Monte_Carlo_method_in_statistical_physics\" rel=\"noopener noreferrer\" target=\"_blank\">computational physics<\/a>, astronomy, etc.\n<\/li>\n<\/ul>\n<p>For all its successes and fame, the basic idea is deceptively simple and easy to demonstrate. We demonstrate it in this article with a simple set of Python code.<\/p>\n<blockquote>\n<p>\nOne of the first and most famous uses of this technique was during the Manhattan Project\n<\/p>\n<\/blockquote>\n<p>\u00a0<\/p>\n<h3>The idea<\/h3>\n<p>\u00a0<\/p>\n<h3>A tricky integral<\/h3>\n<p>\u00a0<br \/>While the general Monte Carlo simulation technique is much broader in scope, we focus particularly on the Monte Carlo integration technique here.<\/p>\n<p>It is nothing but a numerical method for computing complex definite integrals, which lack closed-form analytical solutions.<\/p>\n<p>Say, we want to calculate,<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/466\/1*diraOxH15-bSGuGCm_k1ag.gif\" width=\"60%\"><\/p>\n<p>It\u2019s not easy or downright impossible to get a closed-form solution for this integral in the indefinite form. But\u00a0<strong>numerical approximation can always give us the definite integral as a sum<\/strong>.<\/p>\n<p>Here is the plot of the function.<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/561\/1*PCRO3fc5_5oFw32qUqClYg.png\" width=\"100%\"><\/p>\n<p>\u00a0<\/p>\n<h3>The Riemann sum<\/h3>\n<p>\u00a0<br \/>There are many such techniques under the general category of\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Riemann_sum\" rel=\"noopener noreferrer\" target=\"_blank\"><strong>Riemann sum<\/strong><\/a>. The idea is just to divide the area under the curve into small rectangular or trapezoidal pieces, approximate them by the simple geometrical calculations, and sum those components up.<\/p>\n<p>For a simple illustration, I show such a scheme with only 5 equispaced intervals.<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/559\/1*fNj71EIe38FDwinRpEzMYQ.png\" width=\"100%\"><\/p>\n<p>For the programmer friends, in fact, there is a ready-made\u00a0<a href=\"https:\/\/docs.scipy.org\/doc\/scipy\/reference\/generated\/scipy.integrate.quad.html#scipy.integrate.quad\" rel=\"noopener noreferrer\" target=\"_blank\">function in the Scipy package<\/a>\u00a0which can do this computation fast and accurately.<\/p>\n<p>\u00a0<\/p>\n<h3>What if I go random?<\/h3>\n<p>\u00a0<br \/>What if I told you that I do not need to pick the intervals so uniformly, and, in fact, I can go completely probabilistic, and pick 100% random intervals to compute the same integral?<\/p>\n<p>Crazy talk? My choice of samples could look like this\u2026<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/562\/1*_4R6U_gE5IPw2tI77LkkmQ.png\" width=\"100%\"><\/p>\n<p>Or, this\u2026<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/569\/1*QmtjbwAPAEI4doB4UUDr1Q.png\" width=\"100%\"><\/p>\n<p>We don\u2019t have the time or scope to prove the theory behind it, but it can be shown that\u00a0<strong>with a reasonably high number of random sampling, we can, in fact, compute the integral with sufficiently high accuracy<\/strong>!<\/p>\n<p>We just choose random numbers (between the limits), evaluate the function at those points, add them up, and scale it by a known factor. We are done.<\/p>\n<p>OK. What are we waiting for? Let\u2019s demonstrate this claim with some simple Python code.<\/p>\n<blockquote>\n<p>\nFor all its successes and fame, the basic idea is deceptively simple and easy to demonstrate.\n<\/p>\n<\/blockquote>\n<p>\u00a0<\/p>\n<h3>The Python code<\/h3>\n<p>\u00a0<\/p>\n<h3>Replace complex math by simple averaging<\/h3>\n<p>\u00a0<br \/>If we are trying to calculate an integral \u2014 any integral \u2014 of the form below,<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/267\/1*hAq_zL6SMDACZF-ccLTi4A.png\" width=\"50%\"><\/p>\n<p>we just replace the \u2018estimate\u2019 of the integral by the following average,<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/698\/1*LiGwvZg3idDnlHtet0-1Ig.png\" width=\"100%\"><\/p>\n<p>where the\u00a0<strong><em>U<\/em><\/strong>\u2019s represent uniform random numbers between 0 and 1. Note, how we\u00a0<strong>replace the complex integration process by simply adding up a bunch of numbers and taking their average<\/strong>!<\/p>\n<p>In any modern computing system, programming language, or even commercial software packages like Excel, you have access to this uniform random number generator. Check out my article on this topic,<\/p>\n<p>\u00a0<br \/><a href=\"https:\/\/towardsdatascience.com\/how-to-generate-random-variables-from-scratch-no-library-used-4b71eb3c8dc7\" rel=\"noopener noreferrer\" target=\"_blank\"><b>How to generate random variables from scratch (no library used)<\/b><\/a><br \/>We go through a simple pseudo-random generator algorithm and show how to use that for generating important random\u2026<br \/>\u00a0<\/p>\n<blockquote>\n<p>\nWe just choose random numbers (between the limits), evaluate the function at those points, add them up, and scale it by a known factor. We are done.\n<\/p>\n<\/blockquote>\n<p>\u00a0<\/p>\n<h3>The function<\/h3>\n<p>\u00a0<br \/>Here is a Python function, which accepts another function as the first argument, two limits of integration, and an optional integer to compute the definite integral represented by the argument function.<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/661\/1*Eqzhdon85xF8Qp58Abg40Q.png\" width=\"100%\"><\/p>\n<p>The code may look slightly different than the equation above (or another version that you might have seen in a textbook). That is because\u00a0<strong>I am making the computation more accurate by distributing random samples over 10 intervals<\/strong>.<\/p>\n<p>For our specific example, the argument function looks like,<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/516\/1*N3lJkcjHk93O1-H7Az61Mw.png\" width=\"100%\"><\/p>\n<p>And we can compute the integral by simply passing this to the\u00a0<code>monte_carlo_uniform()<\/code>\u00a0function,<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/377\/1*Hhgz4sPoJbkQVX5zxwyCZA.png\" width=\"100%\"><\/p>\n<p>Here, as you can see, we have taken 100 random samples between the integration limits\u00a0<strong><em>a<\/em><\/strong>\u00a0= 0 and\u00a0<strong><em>b<\/em><\/strong>\u00a0= 4.<\/p>\n<p>\u00a0<\/p>\n<h3>How good is the calculation anyway?<\/h3>\n<p>\u00a0<br \/>This integral cannot be calculated analytically. So, we need to benchmark the accuracy of the Monte Carlo method against another numerical integration technique anyway. We chose the Scipy\u00a0<code>integrate.quad()<\/code>function for that.<br \/>Now, you may also be thinking \u2014\u00a0<strong>what happens to the accuracy as the sampling density changes<\/strong>. This choice clearly impacts the computation speed \u2014 we need to add less number of quantities if we choose a reduced sampling density.<\/p>\n<p>Therefore, we simulated the same integral for a range of sampling density and plotted the result on top of the gold standard \u2014 the Scipy function represented as the horizontal line in the plot below,<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/593\/1*VVDc8dGx_3vXjtQmex6qgQ.png\" width=\"100%\"><\/p>\n<p>Therefore, we observe some small perturbations in the low sample density phase, but they smooth out nicely as the sample density increases. In any case, the absolute error is extremely small compared to the value returned by the Scipy function \u2014 on the order of 0.02%.<\/p>\n<p>The Monte Carlo trick works fantastically!<\/p>\n<div><img src=\"https:\/\/miro.medium.com\/max\/480\/1*ZHGz_uHkasK7dU2Ik4iOIw.jpeg\" alt=\"Figure\" width=\"100%\"><br \/><span><\/p>\n<p><\/span><\/div>\n<p>\u00a0<\/p>\n<h3>What about the speed?<\/h3>\n<p>\u00a0<br \/>But is it as fast as the Scipy method? Better? Worse?<\/p>\n<p>We try to find out by running 100 loops of 100 runs (10,000 runs in total) and obtaining the summary statistics.<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/615\/1*xNSqq2y-eggX5pUwbWddAQ.png\" width=\"100%\"><\/p>\n<p>In this particular example, the Monte Carlo calculations are running twice as fast as the Scipy integration method!<\/p>\n<p>While this kind of speed advantage depends on many factors, we can be assured that the\u00a0<strong>Monte Carlo technique is not a slouch when it comes to the matter of computation efficiency<\/strong>.<\/p>\n<blockquote>\n<p>\nwe observe some small perturbations in the low sample density phase, but they smooth out nicely as the sample density increases\n<\/p>\n<\/blockquote>\n<p>\u00a0<\/p>\n<h3>Rinse, repeat, rinse, and repeat\u2026<\/h3>\n<p>\u00a0<br \/>For a probabilistic technique like Monte Carlo integration, it goes without saying that mathematicians and scientists almost never stop at just one run but repeat the calculations for a number of times and take the average.<\/p>\n<p>Here is a distribution plot from a 10,000 run experiment.<\/p>\n<p><img alt=\"Image for post\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/max\/573\/1*Z7WXMfQA6yv0aYAdg-s8HA.png\" width=\"100%\"><\/p>\n<p>As you can see, the plot almost resembles a\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Normal_distribution\" rel=\"noopener noreferrer\" target=\"_blank\">Gaussian Normal distribution<\/a>\u00a0and this fact can be utilized to not only get the average value but also construct\u00a0<a href=\"https:\/\/www.mathsisfun.com\/data\/confidence-interval.html\" rel=\"noopener noreferrer\" target=\"_blank\"><strong>confidence intervals<\/strong><\/a>\u00a0around that result.<\/p>\n<p><a href=\"https:\/\/www.mathsisfun.com\/data\/confidence-interval.html\" rel=\"noopener noreferrer\" target=\"_blank\"><b>Confidence Intervals<\/b><\/a><br \/>An interval of 4 plus or minus 2 A Confidence Interval is a range of values we are fairly sure our true value lies in\u2026<br \/>\u00a0<\/p>\n<p>\u00a0<\/p>\n<h3>Particularly suitable for high-dimensional integrals<\/h3>\n<p>\u00a0<br \/>Although for our simple illustration (and for pedagogical purpose), we stick to a single-variable integral, the same idea can easily be extended to high-dimensional integrals with multiple variables.<\/p>\n<p>And it is in this higher dimension that the Monte Carlo method particularly shines as compared to Riemann sum based approaches. The sample density can be optimized in a much more favorable manner for the Monte Carlo method to make it much faster without compromising the accuracy.<\/p>\n<p>In mathematical terms, the convergence rate of the method is independent of the number of dimensions. In machine learning speak, the\u00a0<strong>Monte Carlo method is the best friend you have to beat the curse of dimensionality when it comes to complex integral calculations<\/strong>.<\/p>\n<p>Read this article for a great introduction,<\/p>\n<p><a href=\"https:\/\/www.scratchapixel.com\/lessons\/mathematics-physics-for-computer-graphics\/monte-carlo-methods-in-practice\/monte-carlo-integration\" rel=\"noopener noreferrer\" target=\"_blank\"><b>Monte Carlo Methods in Practice (Monte Carlo Integration)<\/b><\/a><br \/>Monte Carlo Methods in Practice If you understand and know about the most important concepts of probability and\u2026<br \/>\u00a0<\/p>\n<blockquote>\n<p>\nAnd it is in this higher dimension that the Monte Carlo method particularly shines as compared to Riemann sum based approaches.\n<\/p>\n<\/blockquote>\n<p>\u00a0<\/p>\n<h3>Summary<\/h3>\n<p>\u00a0<br \/>We introduced the concept of Monte Carlo integration and illustrated how it differs from the conventional numerical integration methods. We also showed a simple set of Python codes to evaluate a one-dimensional function and assess the accuracy and speed of the techniques.<\/p>\n<p>The broader class of\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Monte_Carlo_method\" rel=\"noopener noreferrer\" target=\"_blank\">Monte Carlo simulation techniques<\/a>\u00a0is more exciting and is used in a ubiquitous manner in fields related to artificial intelligence, data science, and statistical modeling.<\/p>\n<p>For example, the famous Alpha Go program from DeepMind used a Monte Carlo search technique to be computationally efficient in the high-dimensional space of the game Go. Numerous such examples can be found in practice.<\/p>\n<p><a href=\"https:\/\/www.analyticsvidhya.com\/blog\/2019\/01\/monte-carlo-tree-search-introduction-algorithm-deepmind-alphago\/\" rel=\"noopener noreferrer\" target=\"_blank\"><b>Introduction to Monte Carlo Tree Search: The Game-Changing Algorithm behind DeepMind&#8217;s AlphaGo<\/b><\/a><br \/>A best-of five-game series, $1 million dollars in prize money &#8211; A high stakes shootout. Between 9 and 15 March, 2016\u2026<br \/>\u00a0<\/p>\n<h3>If you liked it\u2026<\/h3>\n<p>\u00a0<br \/>If you liked this article, you may also like my other articles on similar topics,<\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/how-to-generate-random-variables-from-scratch-no-library-used-4b71eb3c8dc7\" rel=\"noopener noreferrer\" target=\"_blank\"><b>How to generate random variables from scratch (no library used)<\/b><\/a><br \/>We go through a simple pseudo-random generator algorithm and show how to use that for generating important random\u2026<br \/>\u00a0<\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/mathematical-programming-a-key-habit-to-built-up-for-advancing-in-data-science-c6d5c29533be\" rel=\"noopener noreferrer\" target=\"_blank\"><b>Mathematical programming \u2014 a key habit to build up for advancing in data science<\/b><\/a><br \/>We show a small step towards building the habit of mathematical programming, a key skill in the repertoire of a budding\u2026<br \/>\u00a0<\/p>\n<p><a href=\"https:\/\/towardsdatascience.com\/brownian-motion-with-python-9083ebc46ff0\" rel=\"noopener noreferrer\" target=\"_blank\"><b>Brownian motion with Python<\/b><\/a><br \/>We show how to emulate Brownian motion, the most famous stochastic process used in a wide range of applications, using\u2026<br \/>\u00a0<\/p>\n<p>Also, you can check the author\u2019s\u00a0<a href=\"https:\/\/github.com\/tirthajyoti?tab=repositories\" rel=\"noopener noreferrer\" target=\"_blank\"><strong>GitHub<\/strong><\/a><strong>\u00a0repositories\u00a0<\/strong>for code, ideas, and resources in machine learning and data science. If you are, like me, passionate about AI\/machine learning\/data science, please feel free to\u00a0<a href=\"https:\/\/www.linkedin.com\/in\/tirthajyoti-sarkar-2127aa7\/\" rel=\"noopener noreferrer\" target=\"_blank\">add me on LinkedIn<\/a>\u00a0or\u00a0<a href=\"https:\/\/twitter.com\/tirthajyotiS\" rel=\"noopener noreferrer\" target=\"_blank\">follow me on Twitter<\/a>.<\/p>\n<p>\u00a0<br \/><a href=\"https:\/\/towardsdatascience.com\/monte-carlo-integration-in-python-a71a209d277e\" target=\"_blank\" rel=\"noopener noreferrer\">Original<\/a>. Reposted with permission.<\/p>\n<p><b>Related:<\/b><\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/www.kdnuggets.com\/2020\/12\/monte-carlo-integration-python.html<\/p>\n","protected":false},"author":0,"featured_media":8043,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[],"_links":{"self":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts\/8042"}],"collection":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/comments?post=8042"}],"version-history":[{"count":0,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts\/8042\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/media\/8043"}],"wp:attachment":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/media?parent=8042"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/categories?post=8042"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/tags?post=8042"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}