Deep learning experiments for audio classification

A full write-up, including technical explanations and design decisions, as well as a summary of results achieved, can be found within the associated Project Report.

This project consists of several Jupyter notebooks that implement deep learning audio classifiers.

1-us8k-ffn-extract-explore.ipynb
This notebook contains code to extract and visualise audio files from the UrbanSound8K dataset. The feature extraction process uses audio processing metrics from the librosa library, reducing each recording to 193 data points. As the audio information is highly abstracted (we cannot process successive frames using a receptive field), these features are intended to be fed into a feed-forward neural network (FFN).

2-us8k-ffn-train-predict.ipynb
This notebook contains the code to load previously extracted features and feed them into a 3-layer FFN, implemented using TensorFlow and Keras. Also included is some code to evaluate model performance and to generate predictions from individual samples, demonstrating how a trained model would be used to identify the nature of live recordings.

3-us8k-cnn-extract-train.ipynb
This notebook extracts audio features suitable for input into a classic 2-layer Convolutional Neural Network (CNN). Much more of the audio data is preserved in this approach. As the saved numpy feature data is over 2GB, I haven't included it with this repository, but you can use the code in this notebook to extract it from the original UrbanSound8K dataset.

4-us8k-cnn-salamon.ipynb
This notebook implements an alternative CNN, similar to one described by Salamon and Bello.

5-ffbird-cnn.ipynb
This notebook uses the Salamon and Bello CNN to process the FreeField1010 dataset of field recordings, with the goal of recognising the presence of birdsong. The dataset is not part of this repository, so if you want to run this code you'll need to download the data yourself (see instructions in the notebook).

7-us8k-rnn-extract-train.ipynb
This uses a Recurrent Neural Network to classify Mel-frequency cepstral coefficients (MFCC) features.

Do get in touch if you've any questions (me@jaroncollis.com).
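
As a rough illustration of the 193-value feature vector mentioned for the first notebook, here is a minimal sketch of how frame-wise librosa features could be averaged over time and concatenated. The split into 40 MFCC, 12 chroma, 128 mel, 7 spectral contrast and 6 tonnetz values is an assumption (it is one common combination that happens to total 193), and the function names summarise_features and extract_features are hypothetical; the notebook itself may do this differently.

```python
import numpy as np


def summarise_features(feature_matrices):
    """Average each (n_features, n_frames) matrix over time, then concatenate.

    The time axis is discarded, which is why the result suits a
    feed-forward network rather than a convolutional one.
    """
    return np.concatenate([m.mean(axis=1) for m in feature_matrices])


def extract_features(path):
    """Hypothetical helper: reduce one audio file to a fixed-length vector.

    Assumed feature set (not confirmed by the notebook): 40 MFCC + 12 chroma
    + 128 mel bands + 7 spectral contrast + 6 tonnetz = 193 values.
    """
    import librosa  # imported lazily so summarise_features works without it

    y, sr = librosa.load(path)
    stft = np.abs(librosa.stft(y))
    return summarise_features([
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40),
        librosa.feature.chroma_stft(S=stft, sr=sr),
        librosa.feature.melspectrogram(y=y, sr=sr),
        librosa.feature.spectral_contrast(S=stft, sr=sr),
        librosa.feature.tonnetz(y=librosa.effects.harmonic(y), sr=sr),
    ])
```

Calling extract_features on a single recording from the dataset would then yield one row of the feature table, under the assumptions above. Because every recording collapses to the same length regardless of its duration, the rows can be stacked directly into a 2-D array for training.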