{"id":8498,"date":"2022-03-16T04:01:56","date_gmt":"2022-03-16T04:01:56","guid":{"rendered":"https:\/\/wealthrevelation.com\/data-science\/2022\/03\/16\/deep-autoregressive-models\/"},"modified":"2022-03-16T04:01:56","modified_gmt":"2022-03-16T04:01:56","slug":"deep-autoregressive-models","status":"publish","type":"post","link":"https:\/\/wealthrevelation.com\/data-science\/2022\/03\/16\/deep-autoregressive-models\/","title":{"rendered":"Deep Autoregressive Models"},"content":{"rendered":"<div>\n<p>In this blog article, we discuss deep autoregressive generative models (AGMs). Autoregressive models originated in the economics and social science literature on time-series data, where observations from previous steps are used to predict the value at the current and future time steps [SS05]. An autoregressive model can be expressed as:<\/p>\n<p class=\"ql-center-displayed-equation\"><span class=\"ql-right-eqno\"> \u00a0 <\/span><span class=\"ql-left-eqno\"> \u00a0 <\/span><img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-724f124d0252d960e241a60f1089f8ba_l3.png\" height=\"52\" width=\"173\" 
class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"\\begin{equation*} x_{t+1}= \\sum_i^t \\alpha_i x_{t-i} + c_i, \\end{equation*}\" title=\"Rendered by QuickLaTeX.com\"><\/p>\n<p>where the terms <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-8f0b6b1a01f8fcc2f95be0364c090397_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"\\alpha\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> and <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-41a04eeea923a1a0c28094a8a4680525_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"c\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"8\"> are constants that define how much each previous sample <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-c8700e0258243116de0d4f288e2e3b44_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"x_i\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"15\"> contributes to the predicted future value. 
In other words, deep autoregressive generative models are directed, fully observed models in which the outcome at each step depends entirely on the previous data points, as shown in Figure 1.<\/p>\n<div id=\"attachment_5968\" class=\"wp-caption aligncenter\"><img aria-describedby=\"caption-attachment-5968\" loading=\"lazy\" class=\" wp-image-5968\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2022\/03\/autoregressive-directed-graph.png\" alt=\"Autoregressive directed graph.\" width=\"473\" height=\"141\"><\/p>\n<p id=\"caption-attachment-5968\" class=\"wp-caption-text\">Figure 1: Autoregressive directed graph.<\/p>\n<\/div>\n<p>Let\u2019s consider <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-1b5f6890d4dc0ab286366a25a12c8044_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"x \\sim X\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"50\">, where <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"X\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"16\"> is a set of images and each image is <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-ef45e6a52290b873bd130cf45aebb9fa_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"n-\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"24\">dimensional (n pixels). The prediction of a new pixel then depends on all the previously predicted pixels (Figure ?? shows one row of pixels from an image). 
As discussed in our last blog, deep generative models (DGMs) aim to learn the data distribution <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-9887fe5ee376b82e16a89b295d7d0097_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"p_\\theta(x)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"41\"> of the given training data. Following the chain rule of probability, we can express it as:<\/p>\n<p class=\"ql-center-displayed-equation\"><span class=\"ql-right-eqno\"> (1) <\/span><span class=\"ql-left-eqno\"> \u00a0 <\/span><img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-914180213a67b6f604cd44f6e8c0cee7_l3.png\" height=\"49\" width=\"257\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"\\begin{equation*} p_\\theta(x) = \\prod_{i=1}^n p_\\theta(x_i | x_1, x_2, \\dots , x_{i-1}) \\end{equation*}\" title=\"Rendered by QuickLaTeX.com\"><\/p>\n<p>The above equation models the data distribution explicitly through the pixel conditionals, which are tractable (exact likelihood estimation). The right-hand side of the above equation is fully general: it can represent any possible joint distribution of <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-b170995d512c659d8668b4e42e1fef6b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"n\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"11\"> random variables. On the other hand, this kind of representation can have exponential space complexity, because a lookup table for the last conditional alone must cover every configuration of the preceding pixels. 
Therefore, in autoregressive generative models (AGMs), these conditionals are approximated\/parameterized by neural networks.<\/p>\n<h2>Training<\/h2>\n<p>Because AGMs are based on tractable likelihood estimation, training maximizes the likelihood of the images in the given training data <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"X\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"16\">, which can be expressed as:<\/p>\n<p class=\"ql-center-displayed-equation\"><span class=\"ql-right-eqno\"> (2) <\/span><span class=\"ql-left-eqno\"> \u00a0 <\/span><img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-40ed969fa732b1c69b8d8860401ca8f1_l3.png\" height=\"49\" width=\"453\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"\\begin{equation*} \\max_{\\theta} \\sum_{x\\sim X} \\log \\: p_\\theta (x) = \\max_{\\theta} \\sum_{x\\sim X} \\sum_{i=1}^n \\log \\: p_\\theta (x_i | x_1, x_2, \\dots, x_{i-1}) \\end{equation*}\" title=\"Rendered by QuickLaTeX.com\"><\/p>\n<p>The above expression arises because DGMs try to minimize the distance between the distribution of the training data and the distribution of the generated data (please refer to our last blog). 
The distance between two distributions can be measured using the KL divergence:<\/p>\n<p class=\"ql-center-displayed-equation\"><span class=\"ql-right-eqno\"> (3) <\/span><span class=\"ql-left-eqno\"> \u00a0 <\/span><img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-2f34a5e1bfb6b51327c81cf7d461c806_l3.png\" height=\"26\" width=\"347\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"\\begin{equation*} \\min_{\\theta} d_{KL}(p_d (x),p_\\theta (x)) = \\log\\: p_d(x) - \\log \\: p_\\theta(x) \\end{equation*}\" title=\"Rendered by QuickLaTeX.com\"><\/p>\n<p>In the above equation, the right-hand side is understood in expectation over samples drawn from the data distribution. The term <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-90ea057b0e9df753503462afa6672c63_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"p_d(x)\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"41\"> does not depend on <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-356a08e839ab6974a16448e16e56745d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"\\theta\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\">; therefore, the whole objective reduces to Equation <a href=\"#id815024633\">2<\/a>, which represents the MLE (maximum likelihood estimation) objective for learning the model parameters <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-356a08e839ab6974a16448e16e56745d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"\\theta\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"9\"> by maximizing the log-likelihood of the training images <img loading=\"lazy\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/ql-cache\/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"X\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" 
width=\"16\">. From an implementation point of view, the MLE objective can be optimized using variants of stochastic gradient descent (Adam, RMSProp, etc.) on mini-batches.<\/p>\n<h2>Network Architectures<\/h2>\n<p>Since we are discussing deep generative models, we now turn to the deep aspect of AGMs. The parameterization of the conditionals in Equation <a href=\"#id3757209101\">1<\/a> can be realized by different kinds of network architectures. Several architectures have been proposed in the literature to increase the receptive field and memory, allowing more complex distributions to be learned. Here, we mention a few well-known architectures that are widely used in deep AGMs:<\/p>\n<ol>\n<li><strong>Fully-visible sigmoid belief network (FVSBN):<\/strong> FVSBN is the simplest network, without any hidden units: each conditional is a linear combination of the input elements followed by a sigmoid function that keeps the output between 0 and 1. Its advantages are a simple design and a total number of parameters that is quadratic in the input dimension, which is much smaller than exponential [GHCC15].<\/li>\n<li><strong>Neural autoregressive distribution estimator (NADE):<\/strong> To increase the effectiveness of FVSBN, the simplest idea is to use a neural network with one hidden layer instead of logistic regression. NADE is an alternative MLP-based parameterization and is more effective than FVSBN [LM11].<\/li>\n<li><strong>Masked autoencoder for distribution estimation (MADE):<\/strong> Here, a standard autoencoder neural network is modified so that it works as an efficient generative model. 
MADE masks the parameters so that the network satisfies the autoregressive property, where the current sample is reconstructed using previous samples in a given ordering [GGML15].<\/li>\n<li><strong>PixelRNN\/PixelCNN:<\/strong> These architectures were introduced by Google DeepMind in 2016 and exploit the sequential property of AGMs with recurrent and convolutional neural networks.<\/li>\n<\/ol>\n<div id=\"attachment_5967\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2022\/03\/different-autoregressive-architectures.png\" target=\"_blank\" rel=\"noopener\"><img aria-describedby=\"caption-attachment-5967\" loading=\"lazy\" class=\"wp-image-5967 \" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2022\/03\/different-autoregressive-architectures.png\" alt=\"Different autoregressive architectures\" width=\"568\" height=\"270\"><\/a><\/p>\n<p id=\"caption-attachment-5967\" class=\"wp-caption-text\">Figure 2: Different autoregressive architectures (image source: [LM11]).<\/p>\n<\/div>\n<div id=\"attachment_5966\" class=\"wp-caption aligncenter\"><img aria-describedby=\"caption-attachment-5966\" loading=\"lazy\" class=\" wp-image-5966\" src=\"https:\/\/data-science-blog.com\/en\/wp-content\/uploads\/sites\/4\/2022\/03\/results-of-different-autoregressive-architectures.png\" alt=\"Results using different architectures\" width=\"585\" height=\"221\"><\/p>\n<p id=\"caption-attachment-5966\" class=\"wp-caption-text\">Results using different architectures (image source: <a href=\"https:\/\/deepgenerativemodels.github.io\/assets\/slides\/cs236_lecture3.pdf\">https:\/\/deepgenerativemodels.github.io<\/a>).<\/p>\n<\/div>\n<p>PixelRNN uses two different RNN architectures (a unidirectional LSTM and a bidirectional LSTM) to generate pixels horizontally and horizontally-vertically, respectively. 
Furthermore, it utilizes residual connections to speed up convergence and masked convolutions to condition on the different channels of the image. PixelCNN applies several convolutional layers that preserve the spatial resolution and increase the receptive field. Furthermore, masking is applied so that only the previous pixels are used. PixelCNN is faster to train than PixelRNN. However, the outcome quality is better with PixelRNN [vdOKK16].<\/p>\n<h2>Summary<\/h2>\n<p>In this blog article, we discussed deep autoregressive models in detail, along with their mathematical foundation. Furthermore, we discussed the training procedure and summarized different network architectures. We did not discuss the network architectures in detail; we will continue the discussion of PixelCNN and its variations in upcoming blogs.<\/p>\n<h2>References<\/h2>\n<p><strong>[GGML15]<\/strong> Mathieu Germain, Karol Gregor, Iain Murray, and Hugo Larochelle. MADE: masked autoencoder for distribution estimation. CoRR, abs\/1502.03509, 2015.<\/p>\n<p><strong>[GHCC15]<\/strong> Zhe Gan, Ricardo Henao, David Carlson, and Lawrence Carin. Learning Deep Sigmoid Belief Networks with Data Augmentation. In Guy Lebanon and S. V. N. Vishwanathan, editors, Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, volume 38 of Proceedings of Machine Learning Research, pages 268\u2013276, San Diego, California, USA, 09\u201312 May 2015. PMLR.<\/p>\n<p><strong>[LM11]<\/strong> Hugo Larochelle and Iain Murray. The neural autoregressive distribution estimator. In Geoffrey Gordon, David Dunson, and Miroslav Dud\u00edk, editors, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, volume 15 of Proceedings of Machine Learning Research, pages 29\u201337, Fort Lauderdale, FL, USA, 11\u201313 Apr 2011. PMLR.<\/p>\n<p><strong>[SS05]<\/strong> Robert H. Shumway and David S. Stoffer. Time Series Analysis and Its Applications (Springer Texts in Statistics). Springer-Verlag, Berlin, Heidelberg, 2005.<\/p>\n<p><strong>[vdOKK16]<\/strong> A\u00e4ron van den Oord, Nal Kalchbrenner, and Koray Kavukcuoglu. Pixel recurrent neural networks. CoRR, abs\/1601.06759, 2016.<\/p>\n<div id=\"author-bio-box\">\n<h3><a href=\"https:\/\/data-science-blog.com\/en\/blog\/author\/sunilyadav\/\" title=\"All posts by Sunil Yadav\" rel=\"author\">Sunil Yadav<\/a><\/h3>\n<div class=\"bio-gravatar\"><img alt=\"\" src=\"https:\/\/secure.gravatar.com\/avatar\/970316518f70bdde11d4df09cf262eb4?s=70&amp;d=mm&amp;r=g\" class=\"avatar avatar-70 photo\" height=\"70\" width=\"70\" loading=\"lazy\"><\/div>\n<p><a target=\"_blank\" rel=\"nofollow noopener noreferrer\" href=\"https:\/\/www.linkedin.com\/in\/sunil-yadav-9804315a?miniProfileUrn=urnlifs_miniProfileACoAAAyFjzABioeldpBaEuO5tOpf0utOM-kzYdk&amp;lipi=urnlipaged_flagship3_search_srp_all4iVlVv0TnKvm3Q9n4xIA\" class=\"bio-icon bio-icon-linkedin\"><\/a><\/p>\n<p class=\"bio-description\">Sunil Yadav is an experienced researcher with a keen focus on applying academic research to solve real-world problems. He believes a research paper has more value if it can be used for the welfare of society in general and the wellness of people in particular. 
He finished his PhD in mathematics and computer science and has a focus on computer vision, 3D data modelling, and medical imaging. His research interests revolve around understanding the visual data and producing meaningful output using the different areas of mathematics, including Deep learning, Machine learning, and computer vision.<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/data-science-blog.com\/en\/blog\/2022\/03\/15\/deep-autoregressive-models\/<\/p>\n","protected":false},"author":0,"featured_media":8499,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[],"_links":{"self":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts\/8498"}],"collection":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/comments?post=8498"}],"version-history":[{"count":0,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/posts\/8498\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/media\/8499"}],"wp:attachment":[{"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/media?parent=8498"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/categories?post=8498"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wealthrevelation.com\/data-science\/wp-json\/wp\/v2\/tags?post=8498"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}