V_RANDISCR Generate discrete random numbers with specified probabiities [X]=(P,N,A) Usage: (1) v_randiscr([],10) % generate 10 uniform random binary values (2) v_randiscr(2:6,10) % generate 10 random numbers in the range 1:5 with probabilities [2 3 4 5 6]/20 (3) v_randiscr([],10,'abcd') % generate a string of 10 random characters equiprobable from 'abcd' (4) v_randiscr([],-3,'abcd') % generate a string of 3 distinct random characters equiprobable from 'abcd' Inputs: P vector or n-D array of probabilities (not necessarily normalized) [default = uniform] N number of random values to generate: +ve=with and -ve=without replacement [default = 1] A output alphabet [default = 1:length(p) or 0:1 if p is empty] Outputs: X vector of not necessarily distinct values taken from alphabet A If A is omitted and P is a matrix, each row of X(N,M) will contain the M coordinates of a random element of P. The vector P is internally normalized by dividing by its sum.
0001 function x=v_randiscr(p,n,a) 0002 %V_RANDISCR Generate discrete random numbers with specified probabiities [X]=(P,N,A) 0003 % 0004 % Usage: (1) v_randiscr([],10) % generate 10 uniform random binary values 0005 % (2) v_randiscr(2:6,10) % generate 10 random numbers in the range 1:5 0006 % with probabilities [2 3 4 5 6]/20 0007 % (3) v_randiscr([],10,'abcd') % generate a string of 10 random 0008 % characters equiprobable from 'abcd' 0009 % (4) v_randiscr([],-3,'abcd') % generate a string of 3 distinct random 0010 % characters equiprobable from 'abcd' 0011 % 0012 % Inputs: P vector or n-D array of probabilities (not necessarily normalized) [default = uniform] 0013 % N number of random values to generate: +ve=with and -ve=without replacement [default = 1] 0014 % A output alphabet [default = 1:length(p) or 0:1 if p is empty] 0015 % 0016 % Outputs: X vector of not necessarily distinct values taken from alphabet A 0017 % If A is omitted and P is a matrix, each row of X(N,M) will contain the M coordinates 0018 % of a random element of P. 0019 % 0020 % The vector P is internally normalized by dividing by its sum. 0021 0022 % Somewhat similar in function to RANDSRC in the comms toolbox 0023 0024 % Revisions: 0025 % 2020-08-01 Fixed error when n=-(alphabet size) 0026 % 2020-11-03 Clarified comments and speeded up non-replacement sampling 0027 % with uniform probabilities 0028 % 0029 % Copyright (c) 2005-2012 Mike Brookes, mike.brookes@ic.ac.uk 0030 % Version: $Id: v_randiscr.m 10865 2018-09-21 17:22:45Z dmb $ 0031 % 0032 % VOICEBOX is a MATLAB toolbox for speech processing. 0033 % Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html 0034 % 0035 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0036 % This program is free software; you can redistribute it and/or modify 0037 % it under the terms of the GNU General Public License as published by 0038 % the Free Software Foundation; either version 2 of the License, or 0039 % (at your option) any later version. 0040 % 0041 % This program is distributed in the hope that it will be useful, 0042 % but WITHOUT ANY WARRANTY; without even the implied warranty of 0043 % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 0044 % GNU General Public License for more details. 0045 % 0046 % You can obtain a copy of the GNU General Public License from 0047 % http://www.gnu.org/copyleft/gpl.html or by writing to 0048 % Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA. 0049 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0050 gota=nargin>2; 0051 if nargin<1 || ~numel(p) 0052 if gota 0053 p=ones(1,length(a)); 0054 else 0055 p=ones(1,2); 0056 a=(0:1)'; 0057 gota=1; 0058 end 0059 end 0060 if nargin<2 || ~numel(n) 0061 n=1; 0062 end 0063 q=p(:); 0064 d=length(q); % size of output alphabet 0065 if gota && d~=numel(a) 0066 error('sizes of alphabet and probability vector/matrix must match'); 0067 end 0068 if n>1 % sample with replacement 0069 dn=d+n-1; 0070 z=zeros(dn,1); % array to hold random numbers 0071 z(1:d)=cumsum(q/sum(q)); % last value is overwritten in the next line 0072 z(d:end)=rand(n,1); % create a random number for each sample 0073 [y,iy]=sort(z); % the original entries z(1:d-1) now divide the samples into bins for each possible value 0074 y(iy)=(1:dn)'; % create inverse index 0075 m=zeros(dn,1); 0076 m(y(1:d-1))=1; % set original initial d-1 values to 1 0077 m(1)=m(1)+1; % and also the first value 0078 m=cumsum(m); % label each of the random nuumbers with its corresponding value in 1:d 0079 x=m(y(d:dn)); % find the partitions of the n random numbers 0080 else % sample without replacement 0081 n=abs(n); % number of samples needed 0082 if n>d 0083 error('Number of output samples exceeds alphabet size'); 0084 end 0085 if all(q==q(1)) % if all probabilities are equal 0086 [y,iy]=sort(rand(d,1)); % iy is a random permutation of 1:d 0087 x=iy(1:n); % select the first n values 0088 else 0089 f=(1:d)'; % initialize the alphabet to 1:d 0090 x=zeros(n,1); % space for the output samples 0091 nn=n; % number of samples remaining 0092 while nn>0 % number of samples remaining 0093 dd=numel(f); % alphabet size (always >=nn) 0094 if dd==1 % singleton alphabet 0095 x(n)=f; % set remaining sample to the only remaining value 0096 nn=0; % and set remaining value count to zero 0097 else 0098 qq=q(f); % remaiing alphabet probabilities 0099 dn=dd+nn-1; 0100 z=zeros(dn,1); % array to hold random numbers 0101 z(1:dd)=cumsum(qq/sum(qq)); % last value is overwritten in the next line 0102 z(dd:dn)=rand(nn,1); % create a random number for each sample 0103 [y,iy]=sort(z); % the original entries z(1:dd-1) now divide the samples into bins for each possible value 0104 y(iy)=(1:dn)'; % create inverse index 0105 m=zeros(dn,1); 0106 m(y(1:dd-1))=1; % set original initial dd-1 values to 1 0107 m(1)=m(1)+1; % and also the first value 0108 m=cumsum(m); % label each of the random nuumbers with its corresponding value in 1:dd 0109 z=m(y(dd:dn)); % find the partitions of the nn random numbers 0110 [y,iy]=sort(z); % sorts the positions within each partition 0111 z(iy(1+find(y(1:nn-1)==y(2:nn))))=[]; % keep only the first sample from each partition 0112 k=numel(z); % number of new samples 0113 x(n-nn+1:n-nn+k)=f(z); % add new samples into output array 0114 nn=nn-k; % number of remaining samples 0115 f(z)=[]; % remove used entries from alphabet 0116 end 0117 end 0118 end 0119 end 0120 if gota 0121 x=a(x); % select from output alphabet 0122 elseif length(q)>length(p) % need multiple dimensions 0123 v=x-1; 0124 s=cumprod(size(p)); 0125 m=length(s); 0126 s(2:end)=s(1:end-1); 0127 s(1)=1; 0128 x=zeros(n,m); 0129 for i=m:-1:1 0130 x(:,i)=1+floor(v/s(i)); % find indices starting with the last 0131 v=rem(v,s(i)); 0132 end 0133 end