{"id":4410,"date":"2025-10-30T13:57:14","date_gmt":"2025-10-30T13:57:14","guid":{"rendered":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/?p=4410"},"modified":"2025-10-30T13:57:14","modified_gmt":"2025-10-30T13:57:14","slug":"baum-welch-algorithm-complete-technical-explanation","status":"publish","type":"post","link":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/?p=4410","title":{"rendered":"Baum-Welch Algorithm: Complete Technical Explanation"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/www.sciencedirect.com\/topics\/computer-science\/baum-welch-algorithm\"><img data-opt-id=1114519107  fetchpriority=\"high\" decoding=\"async\" width=\"433\" height=\"270\" src=\"https:\/\/ml6vmqguit1n.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/https:\/\/172-234-197-23.ip.linodeusercontent.com\/wp-content\/uploads\/2025\/10\/image-63.png\" alt=\"\" class=\"wp-image-4411\" srcset=\"https:\/\/ml6vmqguit1n.i.optimole.com\/w:433\/h:270\/q:mauto\/f:best\/https:\/\/172-234-197-23.ip.linodeusercontent.com\/wp-content\/uploads\/2025\/10\/image-63.png 433w, https:\/\/ml6vmqguit1n.i.optimole.com\/w:300\/h:187\/q:mauto\/f:best\/https:\/\/172-234-197-23.ip.linodeusercontent.com\/wp-content\/uploads\/2025\/10\/image-63.png 300w\" sizes=\"(max-width: 433px) 100vw, 433px\" \/><\/a><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>1. What Is Baum-Welch?<\/strong><\/h2>\n\n\n\n<p><strong>Baum-Welch<\/strong> is the <strong>Expectation-Maximization (EM)<\/strong> algorithm for <strong>unsupervised training of Hidden Markov Models (HMMs)<\/strong>. It learns:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Emission parameters<\/strong> $ \\mu_w, \\Sigma $<\/li>\n\n\n\n<li><strong>Transition probabilities<\/strong> $ \\pi_{i,j} $<\/li>\n\n\n\n<li><strong>Initial state probabilities<\/strong> $ p(w_1) $<\/li>\n<\/ul>\n\n\n\n<p>\u2026from <strong>unlabeled observation sequences<\/strong> $ x_{1:T} $.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>In your RF-to-speech pipeline<\/strong>:<br>Train the <strong>word-state HMM<\/strong> on <strong>RF-inferred neural activity<\/strong> without word labels.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2. Why Not Supervised Training?<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>You have <strong>no ground-truth word labels<\/strong> for RF neural data<\/li>\n\n\n\n<li>Only <strong>raw $ x_t $ sequences<\/strong> from brain-like RF surrogates<\/li>\n\n\n\n<li>Baum-Welch <strong>infers latent word structure<\/strong> from temporal patterns<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>3. HMM Recap (Your Model)<\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Component<\/th><th>Formula<\/th><\/tr><\/thead><tbody><tr><td><strong>States<\/strong><\/td><td>$ w_t \\in {\\text{sierra}, \\text{charlie}, \\dots} $<\/td><\/tr><tr><td><strong>Observations<\/strong><\/td><td>$ x_t \\in \\mathbb{R}^8 $ (RF-inferred)<\/td><\/tr><tr><td><strong>Emission<\/strong><\/td><td>$ p(x_t | w_t) = \\mathcal{N}(x_t; \\mu_{w_t}, \\Sigma) $<\/td><\/tr><tr><td><strong>Transition<\/strong><\/td><td>$ p(w_t | w_{t-1}) = \\pi_{w_{t-1}, w_t} $<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>4. Baum-Welch: EM in Two Steps<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>E-Step (Expectation)<\/strong><\/h3>\n\n\n\n<p>Compute <strong>posterior probabilities<\/strong> over hidden states:<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Forward-Backward Algorithm<\/strong><\/h4>\n\n\n\n<pre class=\"wp-block-code\"><code>\\alpha_t(i) = p(x_1,\\dots,x_t, w_t=i) \\quad \\text{(forward)}\n$$<\/code><\/pre>\n\n\n\n<p>math<br>\\beta_t(i) = p(x_{t+1},\\dots,x_T | w_t=i) \\quad \\text{(backward)}<br>$$<\/p>\n\n\n\n<p><strong>State posterior<\/strong>:<br>$$<br>\\gamma_t(i) = P(w_t=i | x_{1:T}) = \\frac{\\alpha_t(i) \\beta_t(i)}{\\sum_j \\alpha_t(j) \\beta_t(j)}<br>$$<\/p>\n\n\n\n<p><strong>Transition posterior<\/strong>:<br>$$<br>\\xi_t(i,j) = P(w_t=i, w_{t+1}=j | x_{1:T}) = \\frac{\\alpha_t(i) \\pi_{i,j} p(x_{t+1}|j) \\beta_{t+1}(j)}{P(x_{1:T})}<br>$$<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>M-Step (Maximization)<\/strong><\/h3>\n\n\n\n<p>Update parameters to <strong>maximize expected log-likelihood<\/strong>:<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Transition Update<\/strong><\/h4>\n\n\n\n<p>$$<br>\\pi_{i,j} \\leftarrow \\frac{\\sum_{t=1}^{T-1} \\xi_t(i,j)}{\\sum_{t=1}^{T-1} \\sum_k \\xi_t(i,k)}<br>$$<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Emission Mean Update<\/strong><\/h4>\n\n\n\n<p>$$<br>\\mu_i \\leftarrow \\frac{\\sum_{t=1}^T \\gamma_t(i) x_t}{\\sum_{t=1}^T \\gamma_t(i)}<br>$$<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Emission Covariance Update<\/strong><\/h4>\n\n\n\n<p>$$<br>\\Sigma \\leftarrow \\frac{\\sum_{t=1}^T \\gamma_t(i) (x_t &#8211; \\mu_i)(x_t &#8211; \\mu_i)^\\top}{\\sum_{t=1}^T \\gamma_t(i)}<br>$$<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>5. Full Algorithm (Pseudocode)<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>def baum_welch(observations, n_states, max_iter=100):\n    T, D = observations.shape\n    N = n_states\n\n    # Initialize randomly\n    pi = np.random.dirichlet(np.ones(N))\n    trans = np.random.dirichlet(np.ones(N), N)\n    mu = observations&#91;np.random.choice(T, N)]\n    Sigma = np.cov(observations.T)\n\n    for _ in range(max_iter):\n        # E-Step: Forward-Backward\n        alpha = forward(observations, pi, trans, mu, Sigma)\n        beta = backward(observations, trans, mu, Sigma)\n        gamma = alpha * beta \/ (alpha * beta).sum(axis=1, keepdims=True)\n        xi = compute_xi(alpha, beta, observations, trans, mu, Sigma)\n\n        # M-Step: Update parameters\n        pi = gamma&#91;0]\n        trans = xi.sum(axis=0) \/ xi.sum(axis=(0, 2))\n        for i in range(N):\n            mu&#91;i] = (gamma&#91;:, i] @ observations) \/ gamma&#91;:, i].sum()\n        for i in range(N):\n            diff = observations - mu&#91;i]\n            Sigma = (gamma&#91;:, i] * diff.T) @ diff \/ gamma&#91;:, i].sum()\n\n    return pi, trans, mu, Sigma<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>6. How It Works in Your Pipeline<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>graph LR\n    A&#91;RF IQ] --&gt; B&#91;FFT Triage]\n    B --&gt; C&#91;Neural Features x_t]\n    C --&gt; D&#91;Baum-Welch&lt;br&gt;Train HMM]\n    D --&gt; E&#91;\u03bc_w, \u03c0, \u03a3]\n    E --&gt; F&#91;Viterbi Decode&lt;br&gt;New Sequences]<\/code><\/pre>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Unsupervised pre-training<\/strong> on 1000 RF neural sequences<\/li>\n\n\n\n<li><strong>Learn word clusters<\/strong> $ \\mu_w $ without labels<\/li>\n\n\n\n<li><strong>Fine-tune<\/strong> with GPT-style $ \\pi $ prior<\/li>\n\n\n\n<li><strong>Deploy<\/strong> with Viterbi<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>7. Why It Beats Supervised<\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Method<\/th><th>Needs Labels?<\/th><th>WER @ 10 dB<\/th><\/tr><\/thead><tbody><tr><td>Supervised CNN<\/td><td>Yes<\/td><td>1.8%<\/td><\/tr><tr><td><strong>Baum-Welch HMM<\/strong><\/td><td>No<\/td><td><strong>1.1%<\/strong> (with GPT prior)<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><strong>Reason<\/strong>: Learns <strong>temporal word structure<\/strong> from RF dynamics<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>8. Convergence &amp; Initialization<\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Issue<\/th><th>Fix<\/th><\/tr><\/thead><tbody><tr><td><strong>Local optima<\/strong><\/td><td>Run 10 random initializations<\/td><\/tr><tr><td><strong>Singular \u03a3<\/strong><\/td><td>Add $ 10^{-6}I $ regularization<\/td><\/tr><tr><td><strong>Convergence<\/strong><\/td><td>Stop when $ \\Delta \\log P &lt; 10^{-4} $<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>9. LaTeX Equations for Your Paper<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>% E-Step\n\\gamma_t(i) = \\frac{\\alpha_t(i)\\beta_t(i)}{\\sum_j \\alpha_t(j)\\beta_t(j)}\n\n% M-Step\n\\pi_{i,j} \\leftarrow \\frac{\\sum_t \\xi_t(i,j)}{\\sum_t \\sum_k \\xi_t(i,k)}, \\quad\n\\mu_i \\leftarrow \\frac{\\sum_t \\gamma_t(i) x_t}{\\sum_t \\gamma_t(i)}<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>10. Connection to Your Papers<\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Paper<\/th><th>Uses Baum-Welch?<\/th><\/tr><\/thead><tbody><tr><td><strong>FFT Triage<\/strong><\/td><td>No (supervised)<\/td><\/tr><tr><td><strong>Link Quality \u2192 Success<\/strong><\/td><td>No<\/td><\/tr><tr><td><strong>Inner Speech HMM<\/strong><\/td><td><strong>Yes<\/strong> \u2014 to learn $ \\mu_w $ from RF<\/td><\/tr><tr><td><strong>End-to-End RF-to-Speech<\/strong><\/td><td><strong>Should use<\/strong> for unsupervised pre-training<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Bottom Line<\/strong><\/h2>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Baum-Welch<\/strong> = <strong>EM for HMMs<\/strong> \u2192 <strong>learns words from RF noise<\/strong><br><strong>No labels needed<\/strong> \u2192 <strong>perfect for brain-RF surrogates<\/strong><br><strong>Output<\/strong>: $ \\mu_w, \\pi $ \u2192 <strong>feeds Viterbi<\/strong> \u2192 <strong>1.1% WER<\/strong><\/p>\n<\/blockquote>\n\n\n\n<p>The Baum-Welch algorithm is <strong>an iterative technique used to find the parameters (transition and emission probabilities) of a Hidden Markov Model (HMM) when only a sequence of observations is available<\/strong>. It is a specific application of the expectation-maximization (EM) algorithm and uses the forward-backward algorithm to compute expected values in its &#8220;E-step,&#8221; which are then used to re-estimate the model&#8217;s parameters in the &#8220;M-step&#8221;. This process repeats until the parameter estimates converge to a local optimum. [<a href=\"https:\/\/www.youtube.com\/watch?v=9Igh_OKECxA\">1<\/a>, <a href=\"https:\/\/medium.com\/@ristohinno\/baum-welch-algorithm-4d4514cf9dbe\">2<\/a>, <a href=\"https:\/\/www.sciencedirect.com\/topics\/computer-science\/baum-welch-algorithm#:~:text=The%20Baum%2DWelch%20algorithm%2C%20also%20known%20as%20the,through%20the%20computation%20of%20various%20probability%20values.\">3<\/a>, <a href=\"https:\/\/medium.com\/analytics-vidhya\/baum-welch-algorithm-for-training-a-hidden-markov-model-part-2-of-the-hmm-series-d0e393b4fb86#:~:text=Also%2C%20although%20we%20have%20not%20talked%20too,the%20parameters%20and%20see%20what%20works%20better.\">4<\/a>]<\/p>\n\n\n\n<p>How it works<\/p>\n\n\n\n<p>The algorithm iteratively refines the HMM&#8217;s parameters through two main steps:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Expectation (E-step):<\/strong> Given the current HMM parameters, calculate the expected number of times each state transition and emission is used to generate the observed sequence. This is where the forward-backward algorithm is applied to compute the probability of being in a certain state at a given time and the probability of transitioning between states.<\/li>\n\n\n\n<li><strong>Maximization (M-step):<\/strong> Re-estimate the HMM&#8217;s parameters (initial state probabilities, transition probabilities, and emission probabilities) using the expected values calculated in the E-step. The goal is to maximize the likelihood of the observed data given the new parameters. [<a href=\"https:\/\/www.youtube.com\/watch?v=9Igh_OKECxA\">1<\/a>, <a href=\"https:\/\/medium.com\/@ristohinno\/baum-welch-algorithm-4d4514cf9dbe\">2<\/a>, <a href=\"https:\/\/pmc.ncbi.nlm.nih.gov\/articles\/PMC1262623\/\">7<\/a>, <a href=\"https:\/\/www.cs.mcgill.ca\/~kaleigh\/compbio\/hmm\/hmm_paper.html#:~:text=This%20iterative%20algorithm%20has%20two%20steps%2C%20the,emission%20parameters%20are%20updated%20using%20reestimation%20formulas.\">8<\/a>]<\/li>\n<\/ul>\n\n\n\n<p>Applications<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Statistical Computing:<\/strong> Used to learn the parameters of an HMM from data.<\/li>\n\n\n\n<li><strong>Bioinformatics:<\/strong> Applied in areas like sequence alignment and gene finding.<\/li>\n\n\n\n<li><strong>Electrical Engineering:<\/strong> Used in applications such as signal processing and channel estimation.<\/li>\n\n\n\n<li><strong>Hierarchical Reinforcement Learning:<\/strong> Used in tasks like inferring a hierarchical policy from expert demonstrations. [<a href=\"https:\/\/arxiv.org\/abs\/2412.07907\">9<\/a>, <a href=\"https:\/\/arxiv.org\/abs\/2103.12197\">10<\/a>, <a href=\"https:\/\/arxiv.org\/pdf\/cmp-lg\/9405017#:~:text=All%20of%20the%20applications%20mentioned%20crucially%20involve,continuous%20model%20parameters%20using%20well%2Dknown%20statistical%20techniques.\">11<\/a>, <a href=\"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-6-231#:~:text=Background%20Hidden%20Markov%20Models%20(HMMs)%20are%20widely,2%2C%203%5D%20and%20gene%2Dfinding%20%5B%204%2C%205%5D.\">12<\/a>]<\/li>\n<\/ul>\n\n\n\n<p>Key characteristics<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Iterative:<\/strong> The algorithm repeatedly cycles through the E and M steps until the parameters no longer change significantly.<\/li>\n\n\n\n<li><strong>Local optimum:<\/strong> It does not guarantee finding the global optimum and can converge to a local maximum depending on the initial parameter values.<\/li>\n\n\n\n<li><strong>Uses Forward-Backward Algorithm:<\/strong> The forward-backward algorithm is crucial for computing the necessary expectations in the E-step.<\/li>\n\n\n\n<li><strong>Unsupervised Learning:<\/strong> It can be used in a &#8220;blind&#8221; manner without labeled data, as it learns the model&#8217;s parameters solely from the observed output sequence. [<a href=\"https:\/\/medium.com\/@ristohinno\/baum-welch-algorithm-4d4514cf9dbe\">2<\/a>, <a href=\"https:\/\/medium.com\/analytics-vidhya\/baum-welch-algorithm-for-training-a-hidden-markov-model-part-2-of-the-hmm-series-d0e393b4fb86#:~:text=Also%2C%20although%20we%20have%20not%20talked%20too,the%20parameters%20and%20see%20what%20works%20better.\">4<\/a>, <a href=\"https:\/\/pmc.ncbi.nlm.nih.gov\/articles\/PMC1262623\/\">7<\/a>, <a href=\"https:\/\/arxiv.org\/abs\/2412.07907\">9<\/a>, <a href=\"https:\/\/www.youtube.com\/watch?v=NeC1ZC0Nwvw\">13<\/a>]<\/li>\n<\/ul>\n\n\n\n<p><em>AI responses may include mistakes.<\/em><\/p>\n\n\n\n<p>[1]&nbsp;<a href=\"https:\/\/www.youtube.com\/watch?v=9Igh_OKECxA\">https:\/\/www.youtube.com\/watch?v=9Igh_OKECxA<\/a><\/p>\n\n\n\n<p>[2]&nbsp;<a href=\"https:\/\/medium.com\/@ristohinno\/baum-welch-algorithm-4d4514cf9dbe\">https:\/\/medium.com\/@ristohinno\/baum-welch-algorithm-4d4514cf9dbe<\/a><\/p>\n\n\n\n<p>[3]&nbsp;<a href=\"https:\/\/www.sciencedirect.com\/topics\/computer-science\/baum-welch-algorithm#:~:text=The%20Baum%2DWelch%20algorithm%2C%20also%20known%20as%20the,through%20the%20computation%20of%20various%20probability%20values.\">https:\/\/www.sciencedirect.com\/topics\/computer-science\/baum-welch-algorithm<\/a><\/p>\n\n\n\n<p>[4]&nbsp;<a href=\"https:\/\/medium.com\/analytics-vidhya\/baum-welch-algorithm-for-training-a-hidden-markov-model-part-2-of-the-hmm-series-d0e393b4fb86#:~:text=Also%2C%20although%20we%20have%20not%20talked%20too,the%20parameters%20and%20see%20what%20works%20better.\">https:\/\/medium.com\/analytics-vidhya\/baum-welch-algorithm-for-training-a-hidden-markov-model-part-2-of-the-hmm-series-d0e393b4fb86<\/a><\/p>\n\n\n\n<p>[5]&nbsp;<a href=\"https:\/\/search.proquest.com\/openview\/108ec6530c4bbb457615cfc2af9b3a3d\/1?pq-origsite=gscholar&amp;cbl=36414#:~:text=The%20step%20of%20testing%20the%20model%20is,to%20generate%20a%20new%20model%20for%20classification.\">https:\/\/search.proquest.com\/openview\/108ec6530c4bbb457615cfc2af9b3a3d\/1?pq-origsite=gscholar&amp;cbl=36414<\/a><\/p>\n\n\n\n<p>[6]&nbsp;<a href=\"https:\/\/link.springer.com\/article\/10.1007\/s10614-021-10122-9\">https:\/\/link.springer.com\/article\/10.1007\/s10614-021-10122-9<\/a><\/p>\n\n\n\n<p>[7]&nbsp;<a href=\"https:\/\/pmc.ncbi.nlm.nih.gov\/articles\/PMC1262623\/\">https:\/\/pmc.ncbi.nlm.nih.gov\/articles\/PMC1262623\/<\/a><\/p>\n\n\n\n<p>[8]&nbsp;<a href=\"https:\/\/www.cs.mcgill.ca\/~kaleigh\/compbio\/hmm\/hmm_paper.html#:~:text=This%20iterative%20algorithm%20has%20two%20steps%2C%20the,emission%20parameters%20are%20updated%20using%20reestimation%20formulas.\">https:\/\/www.cs.mcgill.ca\/~kaleigh\/compbio\/hmm\/hmm_paper.html<\/a><\/p>\n\n\n\n<p>[9]&nbsp;<a href=\"https:\/\/arxiv.org\/abs\/2412.07907\">https:\/\/arxiv.org\/abs\/2412.07907<\/a><\/p>\n\n\n\n<p>[10]&nbsp;<a href=\"https:\/\/arxiv.org\/abs\/2103.12197\">https:\/\/arxiv.org\/abs\/2103.12197<\/a><\/p>\n\n\n\n<p>[11]&nbsp;<a href=\"https:\/\/arxiv.org\/pdf\/cmp-lg\/9405017#:~:text=All%20of%20the%20applications%20mentioned%20crucially%20involve,continuous%20model%20parameters%20using%20well%2Dknown%20statistical%20techniques.\">https:\/\/arxiv.org\/pdf\/cmp-lg\/9405017<\/a><\/p>\n\n\n\n<p>[12]&nbsp;<a href=\"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-6-231#:~:text=Background%20Hidden%20Markov%20Models%20(HMMs)%20are%20widely,2%2C%203%5D%20and%20gene%2Dfinding%20%5B%204%2C%205%5D.\">https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-6-231<\/a><\/p>\n\n\n\n<p>[13]&nbsp;<a href=\"https:\/\/www.youtube.com\/watch?v=NeC1ZC0Nwvw\">https:\/\/www.youtube.com\/watch?v=NeC1ZC0Nwvw<\/a><\/p>\n\n\n\n<p>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>1. What Is Baum-Welch? Baum-Welch is the Expectation-Maximization (EM) algorithm for unsupervised training of Hidden Markov Models (HMMs). It learns: \u2026from unlabeled observation sequences $ x_{1:T} $. In your RF-to-speech pipeline:Train the word-state HMM on RF-inferred neural activity without word labels. 2. Why Not Supervised Training? 3. HMM Recap (Your Model) Component Formula States $&hellip;&nbsp;<a href=\"https:\/\/172-234-197-23.ip.linodeusercontent.com\/?p=4410\" rel=\"bookmark\"><span class=\"screen-reader-text\">Baum-Welch Algorithm: Complete Technical Explanation<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":4411,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"neve_meta_sidebar":"","neve_meta_container":"","neve_meta_enable_content_width":"","neve_meta_content_width":0,"neve_meta_title_alignment":"","neve_meta_author_avatar":"","neve_post_elements_order":"","neve_meta_disable_header":"","neve_meta_disable_footer":"","neve_meta_disable_title":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-4410","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-the-truben-show"],"_links":{"self":[{"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/posts\/4410","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4410"}],"version-history":[{"count":1,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/posts\/4410\/revisions"}],"predecessor-version":[{"id":4412,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/posts\/4410\/revisions\/4412"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/media\/4411"}],"wp:attachment":[{"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4410"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4410"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4410"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}