FPGA-Speech-Recognition open-source project

Anonymous user, November 23, 2021

Technical information

Repository URL
https://github.com/didi/di18n
License
GPL-3.0 License

Project details

FPGA Speech Recognition

Simple Speech Recognition System using MATLAB and VHDL on Altera DE0. Demo video here.

Introduction

This project is an attempt to develop a simple speech recognition engine on low-end, educational FPGAs such as the Altera DE0. It is also a small challenge: to push low-end FPGAs to their limits and tame them into doing advanced work. The system is designed to recognize a digit (1 or 0) spoken into a laptop microphone and then transferred to the FPGA over UART. Both industry and academia have spent considerable effort in this field, developing software and hardware in search of a robust solution. However, because of the large number of accents spoken around the world, this conundrum remains an active area of research.

Speech recognition has numerous applications, including healthcare, artificial intelligence, human-computer interaction, Interactive Voice Response systems, military, avionics, etc. Another very important application is helping physically challenged people interact with the world in a better way.

Theory

Speech recognition systems can be classified into several models by describing the types of utterances to be recognized. These classes take into consideration the ability to determine the instant when the speaker starts and finishes the utterance. In this project I aimed to implement an Isolated Word Recognition System, which usually applies a Hamming window over the word being spoken.

Speech recognition engines are broadly classified into 2 types, namely Pattern Recognition and Acoustic Phonetic systems. While the former uses known/trained patterns to determine a match, the latter uses attributes of the human body to compare speech features (phonetics such as vowel sounds). Pattern recognition systems combine with current computing techniques and tend to have higher accuracy.

The basic structure of a speech recognition system is as follows:

1. Speech signal recording.
2. Spectral analysis (FFT, windowing, MFCC, power spectrum).
3. Probability estimation (neural networks, Hidden Markov Model, VQ).
4. Signal decoding and decision making.

Audio signals are captured using microphones and recorded in the time domain (i.e. they vary with time). The problem with human voice signals is that they are not stationary, and analyzing such signals in the time domain is complicated and computationally costly.

This is where spectral analysis comes in: by applying a set of transformations and processing algorithms to the incoming signal, it is converted into a usable form on which further analysis can be done.

For this I am using:

DFT: The discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into an equivalent-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency.
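As an illustration of the definition above, here is a minimal Python sketch of a naive DFT. It is not taken from the project (which uses MATLAB built-ins); the function name dft is hypothetical and the code only evaluates the textbook sum X[k] = sum over n of x[n] * exp(-2*pi*i*k*n/N).

```python
import cmath


def dft(samples):
    """Naive O(N^2) DFT: X[k] = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    n_points = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * n / n_points)
            for n, x in enumerate(samples))
        for k in range(n_points)
    ]


# A constant signal puts all of its content into bin 0 (the DC component).
spectrum = dft([1.0, 1.0, 1.0, 1.0])
```

Each output bin is a complex number whose magnitude tells how strongly that frequency is present in the input.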

Hamming Window: Whenever you do a finite Fourier transform, you are implicitly applying it to an infinitely repeating signal. So, if the start and end of the finite sample don't match, that will look just like a discontinuity in the signal and show up as lots of high-frequency nonsense in the Fourier transform, which you don't want.

And if the sample happens to be a beautiful sinusoid but an integer number of periods doesn't happen to fit exactly into the finite sample, your FT will show appreciable energy in all sorts of places nowhere near the real frequency.

Windowing the data makes sure that the ends match up while keeping everything reasonably smooth; this greatly reduces this sort of "spectral leakage".
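The leakage effect can be seen in a small Python experiment. This is only a sketch under stated assumptions: the window uses the common symmetric form w[n] = 0.54 - 0.46*cos(2*pi*n/(N-1)), and the test tone (10.5 cycles in 64 samples) and the bin range 20..32 are arbitrary choices for illustration, not values from the project.

```python
import cmath
import math


def hamming_window(length):
    """Hamming coefficients w[n] = 0.54 - 0.46*cos(2*pi*n/(length-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def dft_magnitudes(samples):
    """Magnitude of each DFT bin, computed naively for clarity."""
    size = len(samples)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * n / size)
                    for n, x in enumerate(samples)))
            for k in range(size)]


N = 64
# 10.5 cycles do not fit a whole number of times into 64 samples,
# so the ends of the block do not line up.
tone = [math.sin(2 * math.pi * 10.5 * n / N) for n in range(N)]
windowed = [x * w for x, w in zip(tone, hamming_window(N))]

# Summed magnitude in bins 20..32, well away from the true frequency.
leak_rect = sum(dft_magnitudes(tone)[20:33])
leak_hamming = sum(dft_magnitudes(windowed)[20:33])
```

Comparing leak_rect with leak_hamming shows how much less content ends up far from the tone once the window has tapered the block edges.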

Euclidean Distance: The Euclidean distance or Euclidean metric is the "ordinary" straight-line distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm. Older literature refers to the metric as the Pythagorean metric.
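For vectors of equal length this is the square root of the sum of squared differences. A minimal Python sketch (the function name is hypothetical, not from the project):

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two equally long vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# The classic 3-4-5 right triangle.
distance = euclidean_distance([0.0, 0.0], [3.0, 4.0])
```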

Hamming Distance: In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of substitutions required to change one string into the other, or the minimum number of errors that could have transformed one string into the other. In a more general context, the Hamming distance is one of several string metrics for measuring the edit distance between two sequences.
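A short Python sketch of this definition, with a hypothetical function name and example strings chosen only for illustration:

```python
def hamming_distance(a, b):
    """Number of positions at which two equally long sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


# "karolin" and "kathrin" differ at positions 2, 3 and 4.
differences = hamming_distance("karolin", "kathrin")
```

The same function works on lists of samples, which is the form relevant to comparing feature vectors.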

FFT: The FFT is a fast, O[N log(N)] algorithm to compute the Discrete Fourier Transform (DFT), which naively is an O[N^2] computation. The FFT operates by decomposing an N-point time domain signal into N time domain signals each composed of a single point. The second step is to calculate the N frequency spectra corresponding to these N time domain signals. Lastly, the N spectra are synthesized into a single frequency spectrum.
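The decompose-then-synthesize idea can be sketched as a recursive radix-2 decimation-in-time FFT. This Python version is illustrative only (the project relies on MATLAB's built-in FFT) and assumes the input length is a power of two.

```python
import cmath


def fft(samples):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(samples)
    if n == 1:
        # A single point is its own spectrum.
        return list(samples)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(samples[0::2])  # spectrum of even-indexed samples
    odd = fft(samples[1::2])   # spectrum of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size spectra.
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


# One full sine cycle sampled at 4 points.
spectrum = fft([0.0, 1.0, 0.0, -1.0])
```

The recursion splits the signal until single points remain, and the butterfly loop is the synthesis step that merges spectra back together.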

Implementation

The system was first intended to be developed on the FPGA only, without external equipment, but that was impossible because of the limited capabilities of the board I have. I therefore divided the project into 2 stages: the front-end (signal acquisition and analysis) and the back-end (pattern matching and estimation, decision making and UI).

Frontend (MATLAB): The frontend is built in MATLAB because its built-in functions make DSP easy. There are two programs, one for training and obtaining a mean signal and the other for real-time operation. The steps done in MATLAB are:

1. Data acquisition using the microphone.
2. Windowing and Fast Fourier Transform.
3. Plotting and data transmission.

Files in the frontend: [train.m, recorder.m]
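The training program is described as obtaining a mean signal. One plausible reading, sketched here in Python rather than MATLAB, is an element-wise average over several recordings of the same word. The function name and the tiny example vectors are assumptions for illustration; the actual training script may differ.

```python
def mean_template(recordings):
    """Element-wise mean of equally long feature vectors."""
    if not recordings:
        raise ValueError("need at least one recording")
    length = len(recordings[0])
    if any(len(r) != length for r in recordings):
        raise ValueError("all recordings must have the same length")
    count = len(recordings)
    # zip(*recordings) walks the vectors column by column.
    return [sum(column) / count for column in zip(*recordings)]


# Three hypothetical 4-point feature vectors for the same spoken digit.
template = mean_template([
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 4.0],
    [2.0, 2.0, 2.0, 4.0],
])
```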

Backend (Altera DE0): Because the Altera DE0 has no ADC, I transmit the data from the computer's microphone through a USB-to-TTL module over the UART protocol. The received data, 1000 samples long, is then compared with the saved vectors from the MATLAB training. The Euclidean distances are calculated, and the vector more likely to be the right one is given a bigger weight. The weights are then compared and the final result is displayed on the 7-segment displays and LEDs.
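The comparison step amounts to nearest-template classification. The following Python sketch shows the idea with plain Euclidean distance; the templates dictionary, its labels and the short vectors are hypothetical, and the real design runs in VHDL on 1000-sample vectors.

```python
def squared_distance(a, b):
    """Squared Euclidean distance; the square root is not needed for ranking."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(sample, templates):
    """Return the label whose stored template is closest to the sample."""
    return min(templates,
               key=lambda label: squared_distance(sample, templates[label]))


# Hypothetical stored templates for the two digits.
templates = {
    "zero": [10, 40, 5, 0],
    "one": [0, 5, 40, 10],
}
result = classify([1, 6, 38, 9], templates)
```

Skipping the square root keeps the ordering of distances unchanged, which matters on hardware where a square root is costly.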

The backend was modelled as a Moore Finite State Machine with 4 states: (Receiving, Calculating Distance, Decision Making, Displaying Results).

Files in the backend: [Voice_Recognition.vhd, uart_tx.vhd, uart_rx.vhd, uart_parity.vhd, uart.vhd]
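A behavioural model of such a four-state Moore machine can be sketched in Python. Only the state names come from the description above; the transition conditions (rx_done, calc_done), the return to Receiving after display, and the display_enable output are assumptions made for illustration.

```python
from enum import Enum, auto


class State(Enum):
    RECEIVING = auto()
    CALCULATING_DISTANCE = auto()
    DECISION_MAKING = auto()
    DISPLAYING_RESULTS = auto()


def next_state(state, rx_done, calc_done):
    """Transition function; the input flags are hypothetical."""
    if state is State.RECEIVING:
        return State.CALCULATING_DISTANCE if rx_done else state
    if state is State.CALCULATING_DISTANCE:
        return State.DECISION_MAKING if calc_done else state
    if state is State.DECISION_MAKING:
        return State.DISPLAYING_RESULTS
    # Assumed: after displaying, wait for the next utterance.
    return State.RECEIVING


def output(state):
    """Moore output: a function of the current state only."""
    return {"display_enable": state is State.DISPLAYING_RESULTS}


# Walk through one full recognition cycle.
state = State.RECEIVING
trace = [state]
for rx_done, calc_done in [(True, False), (False, True), (False, False)]:
    state = next_state(state, rx_done, calc_done)
    trace.append(state)
```

Because the output depends only on the state, the display logic needs no knowledge of the UART or distance inputs.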

Design Choices and Workarounds

Euclidean Distance Calculation: Calculating the Euclidean distance for a 1000-point vector directly on the FPGA using for loops is very expensive, so I used a little trick and calculated the weights of the vectors indirectly, by counting only the positions where the distance equals zero. This approach is similar to using K-nearest neighbour in machine learning. In other words, we are really calculating the Hamming distance inversely.
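The trick can be sketched in Python as counting exact matches per template and picking the template with the most matches. For vectors of length L the match count equals L minus the Hamming distance, which is the "inverse" relationship mentioned above. The names and example values are hypothetical; the real implementation is in VHDL.

```python
def match_count(sample, template):
    """Weight of a template: positions where the difference is exactly zero."""
    if len(sample) != len(template):
        raise ValueError("vectors must have the same length")
    return sum(1 for s, t in zip(sample, template) if s == t)


def classify_by_matches(sample, templates):
    """Pick the label whose template has the largest weight."""
    return max(templates,
               key=lambda label: match_count(sample, templates[label]))


# Hypothetical quantized feature vectors.
templates = {
    "zero": [3, 7, 7, 1, 0, 2],
    "one": [0, 2, 7, 5, 5, 2],
}
sample = [0, 2, 6, 5, 5, 2]
weights = {label: match_count(sample, vec) for label, vec in templates.items()}
decision = classify_by_matches(sample, templates)
```

Only equality comparators and a counter are needed, with no multipliers or subtractors, which is what makes this cheap in logic.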

FFT Points Discarding: Because not all frequencies are relevant, I took only 1000 points and discarded the rest of the signal. Also, while taking the FFT I discarded half of the output because of its symmetry.
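The symmetry in question is that, for a real-valued input, bin N-k of the DFT is the complex conjugate of bin k, so the second half carries no extra magnitude information. A small Python check, using a naive DFT and an arbitrary 8-sample signal chosen for illustration:

```python
import cmath


def dft(samples):
    """Naive DFT, adequate for a short demonstration."""
    size = len(samples)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * n / size)
                for n, x in enumerate(samples))
            for k in range(size)]


signal = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.25, -0.5]
spectrum = dft(signal)
n = len(signal)

# For real input, X[N-k] is the conjugate of X[k].
is_mirrored = all(abs(spectrum[n - k] - spectrum[k].conjugate()) < 1e-9
                  for k in range(1, n))

# Keep only the first half of the bins, as described in the text.
half_spectrum = spectrum[: n // 2]
```

Keeping exactly N/2 bins drops the single bin at index N/2; whether the project keeps or drops that bin is not stated.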

Moore FSM: The design was built as a Moore machine to make recognition automatic, to decrease user interaction with the system, and to reduce complexity.

UART Module: UART was used for transmitting data because of the limitations of the FPGA board, the simplicity of its implementation, and the availability of conversion modules on the market.
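For readers unfamiliar with UART framing, the following Python sketch builds the bit sequence for one byte. The frame layout (one start bit, 8 data bits sent LSB first, one even-parity bit, one stop bit) is an assumption for illustration; the project's actual configuration is not stated here.

```python
def uart_frame(byte):
    """Bits on the wire for one byte: start, 8 data (LSB first), parity, stop."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in 0..255")
    data_bits = [(byte >> i) & 1 for i in range(8)]  # LSB first
    parity_bit = sum(data_bits) % 2  # even parity (assumed)
    return [0] + data_bits + [parity_bit, 1]  # start bit is 0, stop bit is 1


frame = uart_frame(0x03)
```

With even parity, the parity bit is chosen so that the data bits plus the parity bit contain an even number of ones, letting the receiver detect any single flipped bit.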

Results

RAM consumption is around 380 MB on Ubuntu 16.04 LTS for the frontend.
Logic element consumption is 13,757 LEs.
Consumes 9144 registers and 10,450 logic functions.
Uses 46 pins for the UI and data interface.
Accuracy is 90% for the same speaker and decreases when the speaker changes.
Can detect 2 numbers (one and zero).

Conclusion

This project shows that it is possible to implement a basic speech recognition system on the Altera DE0, and that the limited capabilities of the hardware can be overcome with a number of software workarounds.

The system is able to recognize two digits (1 and 0) with good accuracy for the same speaker. It is speaker dependent to a great extent because of the low number of testing samples. This can be improved by building a bigger dataset from various speakers. Calculating and comparing MFCCs along with the FFT should also make the application more effective and more accurate.

The availability of more powerful hardware would allow me to implement more robust algorithms such as Hidden Markov Models, and to use more capable ADC chips to record cleaner sound, resulting in more accurate results.
