function [m,v,w,g,f,pp,gg]=v_gaussmix(x,c,l,m0,v0,w0,wx)
%V_GAUSSMIX fits a Gaussian mixture pdf to a set of data observations [m,v,w,g,f]=v_gaussmix(x,c,l,m0,v0,w0,wx)
%
% Usage:
%    (1) [m,v,w]=v_gaussmix(x,[],[],k);      % create GMM with k mixtures and diagonal covariances
%    (2) [m,v,w]=v_gaussmix(x,[],[],k,'v');  % create GMM with k mixtures and full covariances
%
% Inputs: n data values, k mixtures, p parameters, l loops
%
%     X(n,p)   Input data vectors, one per row.
%     C(1)     Minimum variance of normalized data (Use [] to take default value of 1/n^2)
%     L        The integer portion of l gives a maximum loop count. The fractional portion gives
%              an optional stopping threshold. Iteration will cease if the increase in
%              log likelihood density per data point is less than this value. Thus l=10.001 will
%              stop after 10 iterations or when the increase in log likelihood falls below 0.001.
%              As a special case, if L=0, then the first three outputs are omitted.
%              Use [] to take default value of 100.0001
%     M0       Number of mixtures required (or initial mixture means - see below)
%     V0       Initialization mode:
%                 'f'    Initialize with K randomly selected data points [default]
%                 'p'    Initialize with centroids and variances of random partitions
%                 'k'    k-means algorithm ('kf' and 'kp' determine initialization)
%                 'h'    k-harmonic means algorithm ('hf' and 'hp' determine initialization) [default]
%                 's'    use unscaled data during initialization phase instead of scaling it first
%                 'm'    M0 contains the initial centres
%                 'v'    full covariance matrices
%              Mode 'hf' [the default] generally gives the best results but 'f' is faster and often OK
%     W0(n,1)  Data point weights
%
%     Alternatively, initial values for M0, V0 and W0 can be given explicitly:
%
%     M0(k,p)        Initial mixture means, one row per mixture.
%     V0(k,p)        Initial mixture variances, one row per mixture.
%      or V0(p,p,k)  one full-covariance matrix per mixture
%     W0(k,1)        Initial mixture weights, one per mixture. The weights should sum to unity.
%     WX(n,1)        Data point weights
%
% Outputs: (Note that M, V and W are omitted if L==0)
%
%     M(k,p)   Mixture means, one row per mixture. (omitted if L==0)
%     V(k,p)   Mixture variances, one row per mixture. (omitted if L==0)
%      or V(p,p,k) if full covariance matrices in use (i.e. either 'v' option or V0(p,p,k) specified)
%     W(k,1)   Mixture weights, one per mixture. The weights will sum to unity. (omitted if L==0)
%     G        Average log probability of the input data points.
%     F        Fisher's Discriminant measures how well the data divides into classes.
%              It is the ratio of the between-mixture variance to the average mixture variance: a
%              high value means the classes (mixtures) are well separated.
%     PP(n,1)  Log probability of each data point
%     GG(l+1,1) Average log probabilities at the beginning of each iteration and at the end
%
% The fitting procedure uses one of several initialization methods to create an initial guess
% for the mixture centres and then uses the EM (expectation-maximization) algorithm to refine
% the guess. Although the EM algorithm is deterministic, the initialization procedures use
% random numbers and so the routine will not give identical answers if you call it multiple
% times with the same input data. See v_randvec() for generating GMM data vectors.
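
% Example (an illustrative sketch, not part of the original file): fit a
% 3-mixture diagonal-covariance GMM to synthetic 2-dimensional data; the
% data and mixture count below are invented purely for illustration.
%
%     x=[randn(500,2); randn(500,2)+4];   % 1000 points drawn from two clusters
%     [m,v,w,g]=v_gaussmix(x,[],[],3);    % m(3,2) means, v(3,2) variances, w(3,1) weights
%                                         % g = average log probability of the data points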

% Bugs/Suggestions
%     (1) Allow processing in chunks by outputting/reinputting an array of sufficient statistics
%     (2) Other initialization options:
%              'l'    LBG algorithm
%              'm'    Move-means (dog-rabbit) algorithm
%     (3) Allow updating of weights-only, not means/variances
%     (4) Allow freezing of means and/or variances

%      Copyright (C) Mike Brookes 2000-2009
%      Version: $Id: v_gaussmix.m 10865 2018-09-21 17:22:45Z dmb $
%
%   VOICEBOX is a MATLAB toolbox for speech processing.
%   Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%   This program is free software; you can redistribute it and/or modify
%   it under the terms of the GNU General Public License as published by
%   the Free Software Foundation; either version 2 of the License, or
%   (at your option) any later version.
%
%   This program is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
%   GNU General Public License for more details.
%
%   You can obtain a copy of the GNU General Public License from
%   http://www.gnu.org/copyleft/gpl.html or by writing to
%   Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[n,p]=size(x);                      % n = number of training values, p = dimension of data vector
wn=ones(n,1);
memsize=v_voicebox('memsize');      % set memory size to use
if isempty(c)
    c=1/n^2;
else
    c=c(1);                         % just to prevent legacy code failing
end
fulliv=0;                           % initial variance is not full
if isempty(l)
    l=100+1e-4;                     % max loop count + stopping threshold
end
if nargin<5 || isempty(v0) || ischar(v0)    % no initial values specified for m0, v0, w0
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % No initial values given, so we must use k-means or equivalent
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    if nargin<6
        if nargin<5 || isempty(v0)
            v0='hf';                % default initialization mode: hf
        end
        wx=wn;                      % no data point weights
    else
        wx=w0(:);                   % data point weights
    end
    wx=wx/sum(wx);
    if any(v0=='m')
        k=size(m0,1);
    else
        k=m0;
    end
    fv=any(v0=='v');                % full covariance matrices requested
    mx0=wx'*x;                      % calculate mean of input data in each dimension
    vx0=wx'*x.^2-mx0.^2;            % calculate variance of input data in each dimension
    sx0=sqrt(vx0);
    sx0(sx0==0)=1;                  % do not divide by zero when scaling
    if n<=k                         % each data point can have its own mixture
        xs=(x-mx0(wn,:))./sx0(wn,:);    % scale the data
        m=xs(mod((1:k)-1,n)+1,:);   % just include all points several times
        v=zeros(k,p);               % will be set to floor later
        w=zeros(k,1);
        w(1:n)=1/n;
        if l>0
            l=0.1;                  % no point in iterating
        end
    else                            % more points than mixtures
        if any(v0=='s')
            xs=x;                   % do not scale data during initialization
        else
            xs=(x-mx0(wn,:))./sx0(wn,:);    % else scale now
            if any(v0=='m')
                m=(m0-mx0(ones(k,1),:))./sx0(ones(k,1),:);  % scale specified means as well
            end
        end
        w=repmat(1/k,k,1);          % all mixtures equally likely
        if any(v0=='k')             % k-means initialization
            if any(v0=='m')
                [m,e,j]=v_kmeans(xs,k,m);
            elseif any(v0=='p')
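                % 'kp' mode: k-means started from the centroids of a random partition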
                [m,e,j]=v_kmeans(xs,k,'p');
            else
                [m,e,j]=v_kmeans(xs,k,'f');
            end
        elseif any(v0=='h')         % k-harmonic means initialization
            if any(v0=='m')
                [m,e,j]=v_kmeanhar(xs,k,[],4,m);
            else
                if any(v0=='p')
                    [m,e,j]=v_kmeanhar(xs,k,[],4,'p');
                else
                    [m,e,j]=v_kmeanhar(xs,k,[],4,'f');
                end
            end
        elseif any(v0=='p')         % Initialize using a random partition
            j=ceil(rand(n,1)*k);    % allocate to random clusters
            j(v_rnsubset(k,n))=1:k; % but force at least one point per cluster
            for i=1:k
                m(i,:)=mean(xs(j==i,:),1);
            end
        else
            if any(v0=='m')
                m=m0;               % use specified centres
            else
                m=xs(v_rnsubset(k,n),:);    % Forgy initialization: sample k centres without replacement [default]
            end
            [e,j]=v_kmeans(xs,k,m,0);       % find out the cluster allocation
        end
        if any(v0=='s')
            xs=(x-mx0(wn,:))./sx0(wn,:);    % scale data now if not done previously
        end
        v=zeros(k,p);               % diagonal covariances
        w=zeros(k,1);
        for i=1:k
            ni=sum(j==i);           % number assigned to this centre
            w(i)=(ni+1)/(n+k);      % weight of this mixture
            if ni
                v(i,:)=sum((xs(j==i,:)-repmat(m(i,:),ni,1)).^2,1)/ni;
            else
                v(i,:)=zeros(1,p);
            end
        end
    end
else
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % use initial values given as input parameters
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    if nargin<7
        wx=wn;                      % no data point weights
    end
    wx=wx(:)/sum(wx);               % normalize weights and force a column vector
    mx0=wx'*x;                      % calculate mean of input data in each dimension
    vx0=wx'*x.^2-mx0.^2;            % calculate variance of input data in each dimension
    sx0=sqrt(vx0);
    sx0(sx0==0)=1;                  % do not divide by zero when scaling
    [k,p]=size(m0);
    xs=(x-mx0(wn,:))./sx0(wn,:);    % scale the data
    m=(m0-mx0(ones(k,1),:))./sx0(ones(k,1),:);  % and the means
    v=v0;
    w=w0;
    fv=ndims(v)>2 || size(v,1)>k;   % full covariance matrices supplied
    if fv
        mk=eye(p)==0;               % mask for off-diagonal elements
        fulliv=any(v(repmat(mk,[1 1 k]))~=0);   % check if any are non-zero
        if ~fulliv
            v=reshape(v(repmat(~mk,[1 1 k])),p,k)'./repmat(sx0.^2,k,1);    % just pick out and scale the diagonal elements for now
        else
            v=v./repmat(sx0'*sx0,[1 1 k]);      % scale the full covariance matrix
        end
    end
end
if length(wx)~=n
    error('%d datapoints but %d weights',n,length(wx));
end
lsx=sum(log(sx0));
xsw=xs.*repmat(wx,1,p);             % weighted data points
if ~fulliv                          % initializing with diagonal covariances
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Diagonal Covariance matrices  %
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    v=max(v,c);                     % apply the lower bound
    xs2=xs.^2.*repmat(wx,1,p);      % square and weight the data for variance calculations

    % If data size is large then do calculations in chunks

    nb=min(n,max(1,floor(memsize/(8*p*k))));    % chunk size for testing data points
    nl=ceil(n/nb);                  % number of chunks
    jx0=n-(nl-1)*nb;                % size of first chunk

    im=repmat(1:k,1,nb); im=im(:);
    th=(l-floor(l))*n;
    sd=(nargout > 3*(l~=0));        % = 1 if we are outputting log likelihood values
    lp=floor(l)+sd;                 % extra loop needed to calculate final G value

    lpx=zeros(1,n);                 % log probability of each data point
    wk=ones(k,1);
    wp=ones(1,p);
    wnb=ones(1,nb);
    wnj=ones(1,jx0);

    % EM loop

    g=0;                            % dummy initial value for comparison
    gg=zeros(lp+1,1);
    ss=sd;                          % initialize stopping count (0 or 1)
    for j=1:lp
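        % Each pass of this loop is one EM iteration: the E-step computes the
        % posterior mixture probabilities px for every data point (in
        % memory-limited chunks) and the M-step re-estimates the weights w,
        % means m and variances v from the accumulated statistics pk, sx and sx2.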
        g1=g;                       % save previous log likelihood (2*pi factor omitted)
        m1=m;                       % save previous means, variances and weights
        v1=v;
        w1=w;
        vi=-0.5*v.^(-1);            % data-independent scale factor in exponent
        lvm=log(w)-0.5*sum(log(v),2);   % log of external scale factor (excluding -0.5*p*log(2pi) term)

        % first do partial chunk (of length jx0)

        jx=jx0;
        ii=1:jx;                    % indices of data points in this chunk
        kk=repmat(ii,k,1);          % kk(k,jx): data point index for each mixture/point pair
        km=repmat(1:k,1,jx);        % km(1,k*jx): mixture index for each mixture/point pair
        py=reshape(sum((xs(kk(:),:)-m(km(:),:)).^2.*vi(km(:),:),2),k,jx)+lvm(:,wnj);    % py(k,jx) unnormalized log pdf of each data point for each mixture
        mx=max(py,[],1);            % mx(1,jx) find normalizing factor for each data point to prevent underflow when using exp()
        px=exp(py-mx(wk,:));        % find normalized probability of each mixture for each datapoint
        ps=sum(px,1);               % total normalized likelihood of each data point
        px=px./ps(wk,:);            % relative mixture probabilities for each data point (columns sum to 1)
        lpx(ii)=log(ps)+mx;
        pk=px*wx(ii);               % pk(k,1) effective fraction of data points for each mixture (could be zero due to underflow)
        sx=px*xsw(ii,:);
        sx2=px*xs2(ii,:);
        for il=2:nl                 % process the remaining data points in chunks
            ix=jx+1;
            jx=jx+nb;               % increment upper limit
            ii=ix:jx;               % indices of data points in this chunk
            kk=repmat(ii,k,1);
            py=reshape(sum((xs(kk(:),:)-m(im,:)).^2.*vi(im,:),2),k,nb)+lvm(:,wnb);
            mx=max(py,[],1);        % find normalizing factor for each data point to prevent underflow when using exp()
            px=exp(py-mx(wk,:));    % find normalized probability of each mixture for each datapoint
            ps=sum(px,1);           % total normalized likelihood of each data point
            px=px./ps(wk,:);        % relative mixture probabilities for each data point (columns sum to 1)
            lpx(ii)=log(ps)+mx;
            pk=pk+px*wx(ii);        % pk(k,1) effective fraction of data points for each mixture (could be zero due to underflow)
            sx=sx+px*xsw(ii,:);
            sx2=sx2+px*xs2(ii,:);
        end
        g=lpx*wx;                   % total log probability summed over all data points
        gg(j)=g;                    % save log prob at each iteration
        w=pk;                       % new mixture weights (pk already sums to unity)
        if pk                       % true if all elements of pk are non-zero
            m=sx./pk(:,wp);         % calculate mixture means
            v=sx2./pk(:,wp);        % and variances
        else
            wm=pk==0;               % mask indicating mixtures with zero weights
            nz=sum(wm);             % number of zero-weight mixtures
            [vv,mk]=sort(lpx);      % find the lowest probability data points
            m=zeros(k,p);           % initialize means and variances to zero (variances are floored later)
            v=m;
            m(wm,:)=xs(mk(1:nz),:); % set zero-weight mixture means to worst-fitted data points
            w(wm)=1/n;              % set these weights non-zero
            w=w*n/(n+nz);           % normalize so the weights sum to unity
            wm=~wm;                 % mask for non-zero weights
            m(wm,:)=sx(wm,:)./pk(wm,wp);    % recalculate means and variances for mixtures with a non-zero weight
            v(wm,:)=sx2(wm,:)./pk(wm,wp);
        end
        v=max(v-m.^2,c);            % subtract mean^2 and apply floor to variances
        if g-g1<=th && j>1
            if ~ss, break; end      % stop
            ss=ss-1;                % stop next time
        end
    end
    if sd && ~fv                    % we need to calculate the final probabilities
        pp=lpx'-0.5*p*log(2*pi)-lsx;    % log of total probability of each data point
        gg=gg(1:j)-0.5*p*log(2*pi)-lsx; % average log prob at each iteration
        g=gg(end);
        m=m1;                       % back up to previous iteration
        v=v1;
        w=w1;
        mm=sum(m,1)/k;
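        % Fisher discriminant: summed squared distance of the mixture means from
        % their centroid, divided by the summed mixture variances (all evaluated
        % while the data is still in its normalized domain)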
        f=(m(:)'*m(:)-k*mm(:)'*mm(:))/sum(v(:));
    end
    if ~fv
        m=m.*sx0(ones(k,1),:)+mx0(ones(k,1),:);    % unscale means
        v=v.*repmat(sx0.^2,k,1);                   % and variances
    else
        v1=v;
        v=zeros(p,p,k);
        mk=eye(p)==1;               % mask for diagonal elements
        v(repmat(mk,[1 1 k]))=v1';  % copy diagonal variances from v1
    end
end
if fv                               % check if full covariance matrices were requested
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Full Covariance matrices  %
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    pl=p*(p+1)/2;
    lix=1:p^2;
    cix=repmat(1:p,p,1);
    rix=cix';
    lix(cix>rix)=[];                % index of lower triangular elements
    cix=cix(lix);                   % index of lower triangular columns
    rix=rix(lix);                   % index of lower triangular rows
    dix=find(rix==cix);
    lixi=zeros(p,p);
    lixi(lix)=1:pl;
    lixi=lixi';
    lixi(lix)=1:pl;                 % reverse index to build full matrices
    v=reshape(v,p^2,k);
    v=v(lix,:)';                    % lower triangular in rows

    % If data size is large then do calculations in chunks

    nb=min(n,max(1,floor(memsize/(24*p*k))));   % chunk size for testing data points
    nl=ceil(n/nb);                  % number of chunks
    jx0=n-(nl-1)*nb;                % size of first chunk
    %
    th=(l-floor(l))*n;
    sd=(nargout > 3*(l~=0));        % = 1 if we are outputting log likelihood values
    lp=floor(l)+sd;                 % extra loop needed to calculate final G value
    %
    lpx=zeros(1,n);                 % log probability of each data point
    wk=ones(k,1);
    wp=ones(1,p);
    wpl=ones(1,pl);                 % 1 index for lower triangular matrix
    wnb=ones(1,nb);
    wnj=ones(1,jx0);

    % EM loop

    g=0;                            % dummy initial value for comparison
    gg=zeros(lp+1,1);
    ss=sd;                          % initialize stopping count (0 or 1)
    vi=zeros(p*k,p);                % stack of k inverse cov matrices each size p*p
    vim=zeros(p*k,1);               % stack of k vectors of the form inv(v)*m
    mtk=vim;                        % stack of k vectors of the form m
    lvm=zeros(k,1);
    wpk=repmat((1:p)',k,1);
    for j=1:lp
        g1=g;                       % save previous log likelihood (2*pi factor omitted)
        m1=m;                       % save previous means, variances and weights
        v1=v;
        w1=w;
        for ik=1:k

            % these lines added for debugging only
            %     vk=reshape(v(ik,lixi),p,p);
            %     condk(ik)=cond(vk);
            %%%%%%%%%%%%%%%%%%%%
            [uvk,dvk]=eig(reshape(v(ik,lixi),p,p));     % convert lower triangular to full and find eigenvalues
            dvk=max(diag(dvk),c);   % apply variance floor to eigenvalues
            vik=-0.5*uvk*diag(dvk.^(-1))*uvk';  % calculate inverse
            vi((ik-1)*p+(1:p),:)=vik;           % vi contains all mixture inverses stacked on top of each other
            vim((ik-1)*p+(1:p))=vik*m(ik,:)';   % vim contains vi*m for all mixtures stacked on top of each other
            mtk((ik-1)*p+(1:p))=m(ik,:)';       % mtk contains all mixture means stacked on top of each other
            lvm(ik)=log(w(ik))-0.5*sum(log(dvk));   % lvm contains the log of w(ik)*sqrt(det(inv(v))) for each mixture
        end
        %
        % first do partial chunk (of length jx0)
        %
        jx=jx0;
        ii=1:jx;
        xii=xs(ii,:).';
        py=reshape(sum(reshape((vi*xii-vim(:,wnj)).*(xii(wpk,:)-mtk(:,wnj)),p,jx*k),1),k,jx)+lvm(:,wnj);
        mx=max(py,[],1);            % find normalizing factor for each data point to prevent underflow when using exp()
        px=exp(py-mx(wk,:));        % find normalized probability of each mixture for each datapoint
        ps=sum(px,1);               % total normalized likelihood of each data point
        px=px./ps(wk,:);            % relative mixture probabilities for each data point (columns sum to 1)
        lpx(ii)=log(ps)+mx;
        pk=px*wx(ii);               % pk(k,1) effective fraction of data points for each mixture (could be zero due to underflow)
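        % sx and sx2 accumulate the responsibility-weighted first and second
        % moments of the data, from which the M-step recomputes m and v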
        sx=px*xsw(ii,:);            % accumulator for mean calculation
        sx2=px*(xsw(ii,rix).*xs(ii,cix));   % accumulator for variance calculation (lower tri cov matrix as a row)
        for il=2:nl
            ix=jx+1;
            jx=jx+nb;               % increment upper limit
            ii=ix:jx;
            xii=xs(ii,:).';
            py=reshape(sum(reshape((vi*xii-vim(:,wnb)).*(xii(wpk,:)-mtk(:,wnb)),p,nb*k),1),k,nb)+lvm(:,wnb);
            mx=max(py,[],1);        % find normalizing factor for each data point to prevent underflow when using exp()
            px=exp(py-mx(wk,:));    % find normalized probability of each mixture for each datapoint
            ps=sum(px,1);           % total normalized likelihood of each data point
            px=px./ps(wk,:);        % relative mixture probabilities for each data point (columns sum to 1)
            lpx(ii)=log(ps)+mx;
            pk=pk+px*wx(ii);        % effective fraction of data points for each mixture (could be zero due to underflow)
            sx=sx+px*xsw(ii,:);     % accumulator for mean calculation
            sx2=sx2+px*(xsw(ii,rix).*xs(ii,cix));   % accumulator for variance calculation
        end
        g=lpx*wx;                   % total log probability summed over all data points
        gg(j)=g;                    % save convergence history
        w=pk;                       % w(k,1) new mixture weights (pk already sums to unity)
        if pk                       % true if all elements of pk are non-zero
            m=sx./pk(:,wp);         % find mean and mean square
            v=sx2./pk(:,wpl);
        else
            wm=pk==0;               % mask indicating mixtures with zero weights
            nz=sum(wm);             % number of zero-weight mixtures
            [vv,mk]=sort(lpx);      % find the lowest probability data points
            m=zeros(k,p);           % initialize means and variances to zero (variances are floored later)
            v=zeros(k,pl);
            m(wm,:)=xs(mk(1:nz),:); % set zero-weight mixture means to worst-fitted data points
            w(wm)=1/n;              % set these weights non-zero
            w=w*n/(n+nz);           % normalize so the weights sum to unity
            wm=~wm;                 % mask for non-zero weights
            m(wm,:)=sx(wm,:)./pk(wm,wp);    % recalculate means and variances for mixtures with a non-zero weight
            v(wm,:)=sx2(wm,:)./pk(wm,wpl);
        end
        v=v-m(:,cix).*m(:,rix);     % subtract off mean squared
        if g-g1<=th && j>1
            if ~ss, break; end      % stop
            ss=ss-1;                % stop next time
        end
    end
    if sd                           % we need to calculate the final probabilities
        pp=lpx'-0.5*p*log(2*pi)-lsx;    % log of total probability of each data point
        gg=gg(1:j)-0.5*p*log(2*pi)-lsx; % average log prob at each iteration
        g=gg(end);
        % gg'                       % *** DEBUG ONLY ***
        m=m1;                       % back up to previous iteration
        v=zeros(p,p,k);             % reserve space for k full covariance matrices
        trv=0;                      % sum of variance matrix traces
        for ik=1:k                  % loop for each mixture to apply variance floor
            [uvk,dvk]=eig(reshape(v1(ik,lixi),p,p));    % convert lower triangular to full and find eigenvectors
            dvk=max(diag(dvk),c);   % apply variance floor to eigenvalues
            v(:,:,ik)=uvk*diag(dvk)*uvk';   % reconstitute full matrix
            trv=trv+sum(dvk);       % add trace to the sum
        end
        w=w1;
        mm=sum(m,1)/k;
        f=(m(:)'*m(:)-k*mm(:)'*mm(:))/trv;  % Fisher discriminant (as in the diagonal-covariance case above)
    else
        v1=v;                       % lower triangular form
        v=zeros(p,p,k);             % reserve space for k full covariance matrices
        for ik=1:k                  % loop for each mixture to apply variance floor
            [uvk,dvk]=eig(reshape(v1(ik,lixi),p,p));    % convert lower triangular to full and find eigenvectors
            dvk=max(diag(dvk),c);   % apply variance floor
            v(:,:,ik)=uvk*diag(dvk)*uvk';   % reconstitute full matrix
        end
    end
    m=m.*sx0(ones(k,1),:)+mx0(ones(k,1),:);    % unscale means
    v=v.*repmat(sx0'*sx0,[1 1 k]);              % and covariances
end
if l==0                             % suppress the first three output arguments if l==0
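    % shift the remaining outputs down so that the caller's first three output
    % arguments receive G, F and PP respectively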
    m=g;
    v=f;
    w=pp;
end
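
% Example (an illustrative sketch, not part of the original file): with L=0 the
% first three outputs are suppressed, so an already-fitted model m,v,w can be
% evaluated on new data xnew (a hypothetical n-by-p matrix) without iterating:
%
%     [g,f,pp]=v_gaussmix(xnew,[],0,m,v,w);   % g = average log prob, f = Fisher
%                                             % discriminant, pp(n,1) = per-point log probs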