FPGA Speech Recognition

Simple Speech Recognition System using MATLAB and VHDL on Altera DE0. Demo Video here
Introduction

This project is an attempt to develop a simple speech recognition engine on low-end, educational FPGAs like the Altera DE0. It is also a small challenge to push the limits of low-end FPGAs and tame them into doing advanced work. The system is designed to recognize a digit (1 or 0) spoken into a laptop microphone and then transferred to the FPGA over UART. Both industry and academia have spent considerable effort in this field, developing software and hardware in search of a robust solution. However, because of the large number of accents spoken around the world, this conundrum remains an active area of research.

Speech recognition has numerous applications, including healthcare, artificial intelligence, human-computer interaction, Interactive Voice Response systems, military and avionics. Another very important application is helping physically challenged people interact with the world in a better way.
Theory

Speech recognition systems can be classified into several models according to the types of utterances they recognize. These classes take into consideration the ability to determine the instant when the speaker starts and finishes the utterance. In this project I aimed to implement an Isolated Word Recognition system, which usually applies a Hamming window over the word being spoken.

Speech recognition engines are broadly classified into 2 types, namely Pattern Recognition and Acoustic Phonetic systems. While the former uses known/trained patterns to determine a match, the latter uses attributes of the human body to compare speech features (phonetics such as vowel sounds). Pattern recognition systems combine well with current computing techniques and tend to have higher accuracy.
The basic structure of a speech recognition system is as follows:

1. Speech Signal Recording.
2. Spectral Analysis (FFT, Windowing, MFCC, Power Spectrum).
3. Probability Estimation (Neural Networks, Hidden Markov Model, VQ).
4. Signal Decoding and Decision Making.

Audio signals are captured using microphones and recorded in the time domain (i.e. they vary with time). The problem with human voice signals is that they are not stationary, and analysing such signals in the time domain is complicated and computationally costly. This is where spectral analysis comes in: by applying a set of transformations and processing algorithms to the incoming signal, it is converted into a usable form on which further analysis can be done.
For this I'm using:

DFT: The discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into an equal-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency.
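As an illustration of the DFT definition, here is a minimal sketch in Python (not the MATLAB code used in this project). The function name `dft` and the test signal are chosen only for this example.

```python
import cmath


def dft(samples):
    """Naive O(N^2) discrete Fourier transform of a finite sequence."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


# A constant signal has all of its energy in bin 0 (the DC component).
spectrum = dft([1.0, 1.0, 1.0, 1.0])
magnitudes = [round(abs(c), 6) for c in spectrum]
```

The output has the same length as the input, and each entry is a complex number whose magnitude says how strongly that frequency is present.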
Hamming Window: Whenever you do a finite Fourier transform, you are implicitly applying it to an infinitely repeating signal. So, if the start and end of the finite sample don't match, that will look just like a discontinuity in the signal and show up as lots of high-frequency nonsense in the Fourier transform, which you don't want.

And if the sample happens to be a beautiful sinusoid but an integer number of periods doesn't fit exactly into the finite sample, your FT will show appreciable energy in all sorts of places nowhere near the real frequency.

Windowing the data makes sure that the ends match up while keeping everything reasonably smooth; this greatly reduces this sort of "spectral leakage".
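The effect of windowing can be checked numerically. The sketch below, in Python with only the standard library, builds a sinusoid that does not fit a whole number of periods into the sample and measures how much spectral energy lands far from the peak, with and without a Hamming window. The helper names, the 64-sample length, the 10.5-cycle tone and the guard width of 4 bins are all assumptions made for this illustration, not values from the project.

```python
import cmath
import math


def hamming(n):
    """Hamming window coefficients: 0.54 - 0.46 * cos(2*pi*i / (n - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def half_spectrum_energy(samples):
    """Squared DFT magnitudes for the first half of the bins (real input)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))) ** 2
        for k in range(n // 2)
    ]


def leakage(samples, guard=4):
    """Fraction of energy lying more than `guard` bins away from the peak bin."""
    energy = half_spectrum_energy(samples)
    peak = max(range(len(energy)), key=energy.__getitem__)
    far = sum(e for k, e in enumerate(energy) if abs(k - peak) > guard)
    return far / sum(energy)


N = 64
# 10.5 cycles in 64 samples: the ends of the sample do not match up.
tone = [math.sin(2 * math.pi * 10.5 * t / N) for t in range(N)]
windowed = [w * x for w, x in zip(hamming(N), tone)]

leak_raw = leakage(tone)
leak_windowed = leakage(windowed)
```

Comparing `leak_raw` with `leak_windowed` shows the windowed signal keeping its energy much closer to the true frequency.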
Euclidean Distance: The Euclidean distance or Euclidean metric is the "ordinary" straight-line distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm. Older literature refers to the metric as the Pythagorean metric.
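For two feature vectors of equal length, the Euclidean distance is the square root of the sum of squared differences. A minimal Python sketch (the function name is chosen for this example):

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The classic 3-4-5 right triangle.
d = euclidean_distance([0, 0], [3, 4])
```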
Hamming Distance: In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of substitutions required to change one string into the other, or the minimum number of errors that could have transformed one string into the other. In a more general context, the Hamming distance is one of several string metrics for measuring the edit distance between two sequences.
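The definition translates almost directly into code. A minimal Python sketch (the function name is chosen for this example):

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


d_words = hamming_distance("karolin", "kathrin")  # differs at r/t, o/h, l/r
d_bits = hamming_distance("1011101", "1001001")
```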
FFT: The FFT is a fast, O[N log(N)] algorithm to compute the Discrete Fourier Transform (DFT), which naively is an O[N^2] computation. The FFT operates by decomposing an N-point time domain signal into N time domain signals, each composed of a single point. The second step is to calculate the N frequency spectra corresponding to these N time domain signals. Lastly, the N spectra are synthesized into a single frequency spectrum.
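The decompose-then-combine idea can be sketched as a recursive radix-2 FFT. This Python version is only an illustration and assumes the input length is a power of two. The project itself relies on MATLAB's built-in functions rather than a hand-written FFT.

```python
import cmath


def fft(samples):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(samples)
    if n == 0:
        raise ValueError("input must not be empty")
    if n == 1:
        # A single point is its own spectrum.
        return [complex(samples[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(samples[0::2])  # spectrum of the even-indexed samples
    odd = fft(samples[1::2])   # spectrum of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# A unit impulse has a flat spectrum: every bin equals 1.
impulse_spectrum = fft([1, 0, 0, 0, 0, 0, 0, 0])
# One full sine period over 4 samples puts its energy in bins 1 and 3.
sine_magnitudes = [round(abs(c), 6) for c in fft([0, 1, 0, -1])]
```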
Implementation

The system was first intended to be developed on the FPGA only, without external equipment, but that was impossible due to the limited capabilities of the board I have. So I divided the project into 2 stages: the front-end (signal acquisition and analysis) and the back-end (pattern matching and estimation, decision making and UI).
Frontend (MATLAB): The frontend is built in MATLAB because its built-in functions make DSP easy. There are 2 programs, one for training and obtaining a mean signal and the other for real-time operation. The steps done in MATLAB are:

1. Data Acquisition using the microphone.
2. Windowing & Fast Fourier Transform.
3. Plotting & Data Transmission.

Files in the Frontend: [train.m, recorder.m]
Backend (Altera DE0): Due to the lack of an ADC on the Altera DE0, I transmit the data from the computer's microphone using a USB-to-TTL module over the UART protocol. The received data, 1000 samples long, is then compared with the saved vectors from the MATLAB training. The Euclidean distances are calculated, and the vector more likely to be the right one is given a bigger weight. The weights are then compared and the final result is displayed on the 7-segment displays and LEDs.

The backend was modelled as a Moore Finite State Machine with 4 states: (Receiving, Calculating Distance, Decision Making, Displaying Results).

Files in the Backend: [Voice_Recognition.vhd, uart_tx.vhd, uart_rx.vhd, uart_parity.vhd, uart.vhd]
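The four-state Moore machine can be modelled in a few lines of Python to show the control flow. This is a behavioural sketch, not the VHDL in Voice_Recognition.vhd. The transition conditions (leaving Receiving once a hypothetical `buffer_full` flag is set, then advancing one state per step) and the output names `uart_listen` and `display_enable` are assumptions made for this illustration.

```python
from enum import Enum


class State(Enum):
    RECEIVING = "Receiving"
    CALCULATING_DISTANCE = "Calculating Distance"
    DECISION_MAKING = "Decision Making"
    DISPLAYING_RESULTS = "Displaying Results"


def next_state(state, buffer_full):
    """Assumed transitions: wait for a full buffer, then step through the rest."""
    if state is State.RECEIVING:
        return State.CALCULATING_DISTANCE if buffer_full else State.RECEIVING
    if state is State.CALCULATING_DISTANCE:
        return State.DECISION_MAKING
    if state is State.DECISION_MAKING:
        return State.DISPLAYING_RESULTS
    return State.RECEIVING  # after displaying, listen for the next word


def outputs(state):
    """Moore machine: outputs depend on the current state only, not on inputs."""
    return {
        "uart_listen": state is State.RECEIVING,
        "display_enable": state is State.DISPLAYING_RESULTS,
    }


def run(buffer_full_inputs, start=State.RECEIVING):
    """Return the sequence of states visited for a sequence of input flags."""
    trace = [start]
    for flag in buffer_full_inputs:
        trace.append(next_state(trace[-1], flag))
    return trace


trace = run([False, True, False, False, False])
```

Because the outputs are a function of the state alone, the machine cycles back to listening without any user interaction.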
Design Choices and Workarounds

Euclidean Distance Calculation: Calculating the Euclidean distance for a 1000-point vector is very expensive to do directly on the FPGA using for loops, so I used a small trick and calculated the weights of the vectors indirectly, by counting only the positions where the distance equals zero. This approach is similar to using K-nearest neighbours in machine learning. In other words, we are really calculating the Hamming distance inversely.
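The match-counting trick can be sketched as follows. This is a Python illustration of the idea, not the VHDL implementation. The 6-element vectors and the labels are made-up toy data, and ties are resolved in favour of the first template, which is an arbitrary choice for this example.

```python
def match_count(received, template):
    """Weight of a template: number of positions where the samples are equal."""
    return sum(r == t for r, t in zip(received, template))


def classify(received, templates):
    """Pick the label whose template has the most exact matches."""
    return max(templates, key=lambda label: match_count(received, templates[label]))


# Toy stand-ins for the stored training vectors (hypothetical values).
templates = {
    "zero": [5, 1, 0, 7, 3, 2],
    "one": [0, 4, 9, 1, 3, 8],
}
received = [0, 4, 9, 7, 3, 8]

weights = {label: match_count(received, vec) for label, vec in templates.items()}
decision = classify(received, templates)
```

Since the match count equals the vector length minus the Hamming distance, picking the largest weight is the same as picking the smallest Hamming distance.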
FFT Points Discarding: Since not all frequencies are relevant, I only took 1000 points and discarded the rest of the signal. Also, while taking the FFT I discarded half of the output because of its symmetry.
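The symmetry argument holds for any real-valued input: the magnitude at bin k equals the magnitude at bin N-k, so the second half of the spectrum carries no extra information. The Python sketch below checks this on a small signal and then keeps only the first few bins. The 16-sample test signal and `keep=6` are illustrative values, where the project keeps 1000 points.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitudes of the naive DFT of a sequence."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n)
    ]


def reduce_spectrum(magnitudes, keep):
    """Drop the mirrored second half, then keep only the first `keep` bins."""
    half = magnitudes[: len(magnitudes) // 2]
    return half[:keep]


N = 16
signal = [
    math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.cos(2 * math.pi * 5 * t / N)
    for t in range(N)
]
mags = magnitude_spectrum(signal)
features = reduce_spectrum(mags, keep=6)
```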
Moore FSM: The design was made as a Moore machine for automatic recognition, to reduce user interaction with the system, and to reduce complexity.

UART Module: UART was used for transmitting data due to the limitations of the FPGA board, the simplicity of its implementation, and the availability of conversion modules on the market.
Results

- RAM consumption is around 380 MB on Ubuntu 16.04 LTS for the frontend.
- Logic element consumption is 13,757 LEs.
- Consumes 9,144 registers and 10,450 logic functions.
- Uses 46 pins for the UI and data interface.
- Accuracy is 90% for the same speaker and decreases when the speaker changes.
- Can detect 2 numbers (one and zero).

Conclusion: This project shows that it is possible to implement a basic speech recognition system on the Altera DE0, and that the limited capabilities of the hardware can be overcome with several software workarounds.
The system is able to recognize two digits (1 and 0) with high accuracy for the same speaker. It is speaker dependent to a great extent due to the low number of testing samples. This can be improved by building a bigger dataset from various speakers. Calculating and comparing MFCCs along with the FFT should also make the application more effective and more accurate.

More powerful hardware would allow me to implement more robust algorithms like Hidden Markov Models, and to use more capable ADC chips that record sound more cleanly, resulting in more accurate results.