V_ADDNOISE Add noise at a chosen SNR [z,p,fso]=(s,fsx,snr,m,nb,fsa) Usage: (1) z=v_addnoise(s,fs,snr); % add white noise using P.56 for speech power with total ouput level at 0 dB (2) z=v_addnoise(s,fs,snr,'',n,fn); % add noise from column vector n() at sample frequency fn with random start % sample and wrapping around as required with a vorbis cross-fade (3) z=v_addnoise(s,fs,snr,'AD'); % use A-weighting when calculating the SNR which is specified as a power ratio (4) z=v_addnoise(s,fs,snr,'f',b,a); % generate noise using filter(b,a,randn(*,1)) but avoiding startup transient (5) z=v_addnoise(s,fs,snr,'g',11); % add speech-shaped noise (noise 11 from v_stdspectrum.m)and plot a graph Inputs: s input signal (column vector) fsx sampling frequency (Hz) or fso output from previous call snr target SNR in dB (or power ratio if 'D' option given) [default: Inf] m mode string [omit or use '' to get default of 'dxopEk'] (1) 'A' use A-weighting when calculating SNR (2) 'd' SNR input and p output given in dB [default] 'D' SNR input and p output given as power ratio (3) 'S' no noise wrapping 'w' wrap noise if it is too short 'W' wrap noise with vorbis cross-fade of +-50 ms [default] (4) 'b' Go fom the beginning of the noise signal 'o' Add a random noise offset even if it means extra wraps [default] 'O' Add a random noise offset but minimize number of wraps (5) 'p' Use P.56 to calculate signal level [default] 'e' Use energy to calculate signal level 'P' Use P.56 to calculate noise level 'E' Use energy to calculate noise level [default] (6) 'z' Assume signal is already at 0 dB 'Z' Assume noise is already at 0 dB (7) 'n' make signal level = 0dB 'N' make noise level = 0dB 't' make sum of speech and noise levels = 0dB [default] 'k' preserve original signal power 'K' preserve original noise power (8) 'x' the output z contains input and noise as separate columns (8) 'f' Inputs nb and fsa specify a z-domain filter nb(z)/fsa(z) applied to Gaussian noise (9) 'g' plot graph [default if no output arguments] nb noise signal or v_stdspectrum type or numerator of noise filter if 'f' option (default: white noise) fsa noise sample frequency [default fsx] or denominator of noise filter if 'f' option Outputs: z noisy signal (single column unless 'x' option given p levels in dB or power (see 'dD' options): [s-in n-in s-out n-out] fso state output for fsx input to future call to allow s to be processed in blocks The noise can have a different sample rate from the signal (fsa and fsx inputs respectively) but if you will be re-using the noise multiple times it is more efficient to use v_resample() on the noise signal. This is the same as resample() but discards the initial and final filter transients. By using the fso output as the fsx input in a subsequent call, you can process a long speech signal in chuncks with the same noise. Note that the speech and noise levels determined fom the first chunck are used for all subsequent chuncks.
0001 function [z,p,fso]=v_addnoise(s,fsx,snr,m,nb,fsa) 0002 %V_ADDNOISE Add noise at a chosen SNR [z,p,fso]=(s,fsx,snr,m,nb,fsa) 0003 % 0004 % Usage: (1) z=v_addnoise(s,fs,snr); % add white noise using P.56 for speech power with total ouput level at 0 dB 0005 % (2) z=v_addnoise(s,fs,snr,'',n,fn); % add noise from column vector n() at sample frequency fn with random start 0006 % % sample and wrapping around as required with a vorbis cross-fade 0007 % (3) z=v_addnoise(s,fs,snr,'AD'); % use A-weighting when calculating the SNR which is specified as a power ratio 0008 % (4) z=v_addnoise(s,fs,snr,'f',b,a); % generate noise using filter(b,a,randn(*,1)) but avoiding startup transient 0009 % (5) z=v_addnoise(s,fs,snr,'g',11); % add speech-shaped noise (noise 11 from v_stdspectrum.m)and plot a graph 0010 % 0011 % Inputs: s input signal (column vector) 0012 % fsx sampling frequency (Hz) or fso output from previous call 0013 % snr target SNR in dB (or power ratio if 'D' option given) [default: Inf] 0014 % m mode string [omit or use '' to get default of 'dxopEk'] 0015 % (1) 'A' use A-weighting when calculating SNR 0016 % (2) 'd' SNR input and p output given in dB [default] 0017 % 'D' SNR input and p output given as power ratio 0018 % (3) 'S' no noise wrapping 0019 % 'w' wrap noise if it is too short 0020 % 'W' wrap noise with vorbis cross-fade of +-50 ms [default] 0021 % (4) 'b' Go fom the beginning of the noise signal 0022 % 'o' Add a random noise offset even if it means extra wraps [default] 0023 % 'O' Add a random noise offset but minimize number of wraps 0024 % (5) 'p' Use P.56 to calculate signal level [default] 0025 % 'e' Use energy to calculate signal level 0026 % 'P' Use P.56 to calculate noise level 0027 % 'E' Use energy to calculate noise level [default] 0028 % (6) 'z' Assume signal is already at 0 dB 0029 % 'Z' Assume noise is already at 0 dB 0030 % (7) 'n' make signal level = 0dB 0031 % 'N' make noise level = 0dB 0032 % 't' make sum of speech and noise levels = 0dB [default] 0033 % 'k' preserve original signal power 0034 % 'K' preserve original noise power 0035 % (8) 'x' the output z contains input and noise as separate columns 0036 % (8) 'f' Inputs nb and fsa specify a z-domain filter nb(z)/fsa(z) applied to Gaussian noise 0037 % (9) 'g' plot graph [default if no output arguments] 0038 % nb noise signal or v_stdspectrum type or numerator of noise filter if 'f' option (default: white noise) 0039 % fsa noise sample frequency [default fsx] or denominator of noise filter if 'f' option 0040 % 0041 % Outputs: z noisy signal (single column unless 'x' option given 0042 % p levels in dB or power (see 'dD' options): [s-in n-in s-out n-out] 0043 % fso state output for fsx input to future call to allow s to be processed in blocks 0044 % 0045 % The noise can have a different sample rate from the signal (fsa and fsx 0046 % inputs respectively) but if you will be re-using the noise multiple times 0047 % it is more efficient to use v_resample() on the noise signal. This is the 0048 % same as resample() but discards the initial and final filter transients. 0049 % By using the fso output as the fsx input in a subsequent call, you can 0050 % process a long speech signal in chuncks with the same noise. Note that 0051 % the speech and noise levels determined fom the first chunck are used for 0052 % all subsequent chuncks. 0053 0054 % Copyright (C) Mike Brookes 2014 0055 % Version: $Id: v_addnoise.m 10461 2018-03-29 13:30:51Z dmb $ 0056 % 0057 % VOICEBOX is a MATLAB toolbox for speech processing. 0058 % Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html 0059 % 0060 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0061 % This program is free software; you can redistribute it and/or modify 0062 % it under the terms of the GNU General Public License as published by 0063 % the Free Software Foundation; either version 2 of the License, or 0064 % (at your option) any later version. 0065 % 0066 % This program is distributed in the hope that it will be useful, 0067 % but WITHOUT ANY WARRANTY; without even the implied warranty of 0068 % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 0069 % GNU General Public License for more details. 0070 % 0071 % You can obtain a copy of the GNU General Public License from 0072 % http://www.gnu.org/copyleft/gpl.html or by writing to 0073 % Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA. 0074 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 0075 0076 ns=numel(s); % number of speech samples 0077 if isstruct(fsx) 0078 fs=fsx.fs; % sample frequency 0079 nb=fsx.nb; % noise samples or filter numerator 0080 genf=fsx.genf; % filtered white noise flag 0081 nsc=fsx.nsc; % cumulative sample count 0082 gm=fsx.gm; % gains for speech and noise 0083 p=fsx.p; % output powers 0084 if genf 0085 zof=fsx.zof; % noise generating filter state 0086 fsa=fsx.fsa; % filter denominator 0087 else 0088 ioff=fsx.ioff; % offset in noise signal 0089 end 0090 %% decode input arguments 0091 else 0092 fs=fsx; % sample frequency 0093 nsc=0; % cumulative sample count 0094 if nargin<3 || ~numel(snr) 0095 snr=Inf; 0096 end 0097 if nargin<4 || ~numel(m) 0098 m=''; 0099 end 0100 if nargin<5 || ~numel(nb) 0101 nb=1; % default is white noise 0102 end 0103 if any(m=='A') && (~any(m=='z') || ~any(m=='Z')) 0104 [awb,awa]=v_stdspectrum(2,'z',fs); % create an A-weighting filter 0105 end 0106 if any(m=='z') 0107 se=1; % speech power given as 0 dB 0108 elseif any(m=='A') 0109 sf=filter(awb,awa,s); % apply A-weighting 0110 if any(m=='e') 0111 se=mean(sf.^2); % use normal energy 0112 else 0113 se=activlev(sf,fs); % or else P.56 0114 end 0115 else 0116 if any(m=='e') 0117 se=mean(s.^2); % use normal energy 0118 else 0119 se=activlev(s,fs); % or else P.56 0120 end 0121 end 0122 %% Now sort out the noise 0123 genf=any(m=='f') || ischar(nb) || numel(nb)==1; % generate noise locally 0124 if genf % generate noise locally 0125 if ~any(m=='f') % specify standard spectrum 0126 [nb,fsa]=v_stdspectrum(nb,'z',fs); 0127 end 0128 [dum1,zof,dum2,ne]=v_randfilt(nb,fsa,0); 0129 if any(m=='A') % convolve with A-weighting to find power 0130 [dum1,dum2,dum3,ne]=v_randfilt(conv(nb,awb),conv(fsa,awa),0); 0131 end 0132 %% use the supplied noise signal 0133 else % noise signal supplied 0134 hov=round(0.1*fs); % fading overlap width 0135 moff=min(2*any(m=='b')+any(m=='O'),2); % moff=0='o',1='O', 2='b' 0136 mwr=min(2*any(m=='S')+any(m=='w'),2); % mwr= 0='W',1='w', 2='S' 0137 nn=numel(nb); 0138 if size(nn,2)>1 0139 error('Noise signal must be a column vector'); 0140 end 0141 if nargin<6 0142 fsa=fs; % default sample rate is same as for speech 0143 end 0144 if fsa~=fs % need to do resampling 0145 frat=fs/fsa; 0146 nbadd=ceil(10*max(frat,1)+1); % half-length of filter at output sample rate 0147 nreq=ceil((ns+4*nbadd+hov+10)/frat); % worst case number of input samples we need 0148 if nargout<3 && nreq<nn % just resample the bit we need 0149 ioff=0; % initial offset in original noise signal 0150 if moff<2 0151 if mwr<2 && moff==0 % any offset allowed 0152 ioff=floor(rand(1)*nn); 0153 else 0154 ioff=floor(rand(1)*(nn-nreq)); % choose offset to prevent wrapping 0155 end 0156 end 0157 if nreq+ioff<nn % no wrapping required 0158 nb=resample(reshape(nb(ioff+1:ioff+nreq),[],1),round(fs*10),round(fsa*10)); 0159 moff=2; % start at beginning 0160 mwr=2; % no wrapping 0161 else % resample portions at the beginning and end and define offset to avoid the middle 0162 nb=resample([nb(1:nreq+ioff-nn+ceil(10/frat));nb(ioff+1:nn)],round(fs*10),round(fsa*10)); 0163 joff=length(nb)-floor((nn-ioff)*frat)+5-nbadd; % offset to use when wrapping 0164 moff=3; 0165 end 0166 else % resample the entire noise signal 0167 nb=resample(nb(:),round(fs*10),round(fsa*10)); 0168 if length(nb)<=2*nbadd 0169 error('noise signal too short after resampling'); 0170 end 0171 end 0172 nb=nb(nbadd+1:end-nbadd); % trim ends to avoid resampling transients 0173 nn=numel(nb); 0174 end 0175 %% determine the starting offset within the noise signal 0176 ioff=0; % initial offset in noise signal 0177 switch mwr 0178 case 2 % mwr=2: no wrapping 0179 if nn<ns 0180 error('noise signal too short'); 0181 end 0182 if moff<2 % choose an offset 0183 ioff=floor(rand(1)*(nn-ns)); 0184 end 0185 case 1 % mwr=1: abrupt wrapping 0186 switch moff 0187 case 3 0188 ioff=joff; % use pre-calculated offset 0189 case 1 % minimize number of wraps 0190 ioff=floor(rand(1)*(nn-mod(ns-1,nn))); 0191 case 0 % allow extra wraps 0192 ioff=floor(rand(1)*nn); 0193 end 0194 0195 case 0 % mwr=0: faded wrapping 0196 hov=round(0.1*fs); % overlap is 0.1 seconds 0197 if nn<2*hov 0198 error('noise signal too short'); 0199 end 0200 switch moff 0201 case 3 0202 ioff=joff; % use pre-calculated offset 0203 case 1 % minimize number of wraps 0204 ioff=floor(rand(1)*(nn+1-2*hov-mod(ns-hov-1,nn-hov))); 0205 case 0 % allow any number of wraps 0206 ioff=floor(rand(1)*nn-hov); 0207 end 0208 if nargin>=3 || ioff+ns>nn % we need to create a wrapped noise signal 0209 wf=sin(0.25*pi*(1+cos((1:hov)*pi/(1+hov))))'; 0210 nb(nn-hov+1:nn)=nb(1:hov).*wf(hov:-1:1)+nb(nn-hov+1:nn).*wf; 0211 nb(1:hov)=[]; 0212 end 0213 end 0214 if any(m=='Z') 0215 ne=1; % noise power given as 0 dB 0216 elseif any(m=='A') 0217 sf=filter(awb,awa,nb); % apply A-weighting 0218 if any(m=='P') 0219 ne=activlev(sf,fs); % use P.56 for noise power 0220 else 0221 ne=mean(sf.^2); % or just normal energy 0222 end 0223 else 0224 if any(m=='P') 0225 ne=activlev(nb,fs); % use P.56 for noise power 0226 else 0227 ne=mean(nb.^2); % or just normal energy 0228 end 0229 end 0230 end 0231 %% Determine scaling factors 0232 if any(m=='D') 0233 snre=snr; 0234 else 0235 snre=10^(0.1*snr); % convert from dB 0236 end 0237 if snre>1 % positive SNR -> fix the signal level 0238 if any(m=='n') 0239 sze=1; 0240 elseif any(m=='N') 0241 sze=snre; 0242 elseif any(m=='K') 0243 sze=ne*snre; 0244 elseif any(m=='k') 0245 sze=se; 0246 else 0247 sze=1/(1+1/snre); 0248 end 0249 nze=sze/snre; 0250 else % negative SNR -> fix the noise level 0251 if any(m=='n') 0252 nze=1/snre; 0253 elseif any(m=='N') 0254 nze=1; 0255 elseif any(m=='K') 0256 nze=ne; 0257 elseif any(m=='k') 0258 nze=se/snre; 0259 else 0260 nze=1/(1+snre); 0261 end 0262 sze=nze*snre; 0263 end 0264 pe=[se ne sze nze]; % powers 0265 if (se==0 && sze>0) || (ne==0 && nze>0) || sze==Inf || nze==Inf 0266 if (se==0 && sze>0) || sze==Inf 0267 error('Infinite gain for signal with mode ''%s'': input=%.1f dB, output=%.1f dB',m,db(se)/2,db(sze)/2); 0268 else 0269 error('Infinite gain for noise with mode ''%s'': input=%.1f dB, output=%.1f dB',m,db(ne)/2,db(nze)/2); 0270 end 0271 end 0272 gm=sqrt([sze/(se+(se==0)) nze/(ne+(ne==0))]); % gains for speech and noise 0273 p=pe; 0274 if ~any(m=='D') && nargout~=1 0275 mk=pe~=Inf & pe~=0; 0276 p(mk)=10*log10(pe(mk)); 0277 p(pe==0)=-Inf; 0278 end 0279 end 0280 if gm(2)>0 % only generate noise if it has a non-zero gain 0281 nn=length(nb); % caluculate the length of the noise signal or filter numerator 0282 if genf % we need to generate new noise 0283 if nn==1 && numel(fsa)==1 && nb==1 && fsa==1 0284 n=randn(ns,1); 0285 else 0286 [n,zof]=filter(nb,fsa,randn(ns,1),zof); 0287 end 0288 else % use supplied noise signal 0289 n=nb(1+mod(ioff+nsc:ioff+ns-1+nsc,nn)); 0290 end 0291 if any(m=='x') 0292 z=[gm(1)*s(:) gm(2)*n(:)]; 0293 else 0294 z=gm(1)*s(:)+gm(2)*n(:); 0295 end 0296 elseif any(m=='x') 0297 z=[gm(1)*s(:) zeros(ns,1)]; 0298 else 0299 z=gm(1)*s(:); 0300 end 0301 %% create output state if required 0302 if nargout>2 0303 fso.fs=fs; % sample frequency 0304 fso.genf=genf; % filtered white noise flag 0305 fso.nsc=nsc+ns; % cumulative sample count 0306 fso.nb=nb; % noise signal or filter numerator 0307 fso.gm=gm; % gains for speech and noise 0308 fso.p=p; % output powers 0309 if genf 0310 fso.zof=zof; % noise generating filter state 0311 fso.fsa=fsa; % filter denominator 0312 else 0313 fso.ioff=ioff; % offset in noise signal 0314 end 0315 end 0316 %% now plot if no output arguments 0317 if ~isstruct(fsx) && (~nargout || any(m=='g')) 0318 tax=(1:ns)/fs; 0319 subplot(3,1,3) 0320 zp=sqrt(mean(z.^2)); 0321 if zp>0 0322 lg=20*log10(zp); 0323 else 0324 lg=-Inf; 0325 end 0326 plot(tax,z,'-y',tax([1 ns]),[0 0],':k',tax([1 ns]),[zp zp],'-b'); 0327 texthvc(tax(1),zp,sprintf(' %.1f dB',lg),'lbb'); 0328 axisenlarge([-1.005 -1.05]); 0329 ylim=get(gca,'ylim'); 0330 if snre==0 0331 lg=-Inf; 0332 elseif snre==Inf 0333 lg=Inf; 0334 else 0335 lg=10*log10(snre); 0336 end 0337 texthvc(tax(ns),ylim(2),sprintf('%.1f dB SNR ',lg),'rtb'); 0338 xlabel('Time (s)'); 0339 ylabel('Output'); 0340 subplot(3,1,2) 0341 if nze==0 0342 plot(tax([1 ns]),[0 0],'-y'); 0343 axisenlarge([-1.005 1]); 0344 else 0345 plot(tax,n,'-y',tax([1 ns]),[0 0],':k',tax([1 ns]),sqrt(pe(2))*[1 1],'-b'); 0346 if pe(2)==0 0347 lg=-Inf; 0348 elseif pe(2)==Inf 0349 lg=Inf; 0350 else 0351 lg=10*log10(pe(2)); 0352 end 0353 texthvc(tax(1),sqrt(pe(2)),sprintf(' %.1f dB',lg),'lbb'); 0354 axisenlarge([-1.005 -1.05]); 0355 ylim=get(gca,'ylim'); 0356 if pe(4)/pe(2)==0 0357 lg=-Inf; 0358 elseif pe(4)/pe(2)==Inf 0359 lg=Inf; 0360 else 0361 lg=10*log10(pe(4)/pe(2)); 0362 end 0363 texthvc(tax(ns),ylim(2),sprintf('\\times %.1f dB ',lg),'rtb'); 0364 end 0365 ylabel('Noise'); 0366 subplot(3,1,1) 0367 plot(tax,s,'-y',tax([1 ns]),[0 0],':k',tax([1 ns]),sqrt(pe(1))*[1 1],'-b'); 0368 if pe(1)==0 0369 lg=-Inf; 0370 elseif pe(1)==Inf 0371 lg=Inf; 0372 else 0373 lg=10*log10(pe(1)); 0374 end 0375 texthvc(tax(1),sqrt(pe(1)),sprintf(' %.1f dB',lg),'lbb'); 0376 axisenlarge([-1.005 -1.05]); 0377 ylim=get(gca,'ylim'); 0378 if pe(3)/pe(1)==0 0379 lg=-Inf; 0380 elseif pe(3)/pe(1)==Inf 0381 lg=Inf; 0382 else 0383 lg=10*log10(pe(3)/pe(1)); 0384 end 0385 texthvc(tax(ns),ylim(2),sprintf('\\times %.1f dB ',lg),'rtb'); 0386 ylabel('Signal'); 0387 end