Dankenstein is a Markov Chain Twitter Bot generator, based on making a mashup of different corpora.
It is described in this blog post.

Dependencies

The software itself is implemented using Bash, Make, and Python.

You'll need a Twitter account and a Twitter application. From the latter, you'll need a consumer key, a consumer secret, an access token and an access token secret.
Enter these into twitterCredentials.sh

To generate the supported datasets you'll need:

- Poppler
    brew install poppler (on macOS)
    sudo apt-get install -y poppler-utils (on Ubuntu)

After that, you can run the importEnv.sh script in the dev folder. This will create a Python virtual environment (conda if you have it, virtualenv otherwise) called dankenstein, and install all Python dependencies.

If you'd rather install them separately, they are: Tweepy, darklyrics, wikiquote, markovify (and, optionally, nltk).

Usage

Usage is based around Make. The following options are available:

- make corpora: Builds all corpora.
- make list: Lists all available corpora.
- make combinations: Prints all possible combinations of two candidates - may include duplicates (with switched positions).
- make model ARGS="corpus1 corpus2 [(scale1 scale2) stateSize overlapTotal overlapRatio tries sentences modelComplexity]": Generates a model based on two corpora (the only required args). If you define scale1, you must define scale2. All options inside the square brackets are optional.

    ARGS:
    corpus1          string     - the name of the first corpus                                             no default
    corpus2          string     - the name of the second corpus                                            no default
    scale1           float/int  - scale of corpus1 relative to corpus2                                     defaults to 1
    scale2           float/int  - scale of corpus2 relative to corpus1                                     defaults to 1
    stateSize        int        - state size of the Markov chain                                           defaults to 2
    overlapTotal     int        - maximum sequential words that overlap with a sentence from the corpora   defaults to 15
    overlapRatio     int        - maximum percentage of overlap with a sentence from the corpora           defaults to 70
    tries            int        - attempts to make an original sentence                                    defaults to 10
    sentences        int        - number of sentences                                                      defaults to 5
    modelComplexity  string     - naive|expert                                                             defaults to "naive"

    EXAMPLE:
    make model ARGS="tcm jobs 2 1 2 15 30 10 25 expert" && make sentences

- make sentence: Outputs one sentence based on an available (previously trained) model.
- make sentences: Outputs the number of sentences given as an argument during model generation.
- make clean: Deletes all corpora.
- make tweet: Posts a tweet.

It is possible to see all combinations for a given candidate, e.g.

    make combinations | grep "picard+"
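
To make the model options above more concrete, here is a minimal standard-library sketch of what scale1, scale2, stateSize and tries mean for a two-corpus mashup. It is not Dankenstein's actual code (the project lists markovify as its dependency for this); the function names build_chain, combine, make_sentence and make_original_sentence are hypothetical, and the originality check is deliberately simplified to "reject sentences that appear verbatim in a corpus" rather than the overlapTotal/overlapRatio rules.

```python
import random
from collections import defaultdict

BEGIN, END = "__BEGIN__", "__END__"


def build_chain(sentences, state_size=2):
    """Count transitions: state (tuple of state_size words) -> {next word: count}."""
    chain = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = [BEGIN] * state_size + sentence.split() + [END]
        for i in range(len(words) - state_size):
            state = tuple(words[i:i + state_size])
            chain[state][words[i + state_size]] += 1
    return chain


def combine(chain1, chain2, scale1=1, scale2=1):
    """Merge two chains, multiplying each chain's counts by its scale."""
    combined = defaultdict(lambda: defaultdict(float))
    for chain, scale in ((chain1, scale1), (chain2, scale2)):
        for state, followers in chain.items():
            for word, count in followers.items():
                combined[state][word] += count * scale
    return combined


def make_sentence(chain, state_size=2, rng=random, max_words=50):
    """Walk the chain from the BEGIN state until END is drawn (or max_words is hit)."""
    state = (BEGIN,) * state_size
    words = []
    while len(words) < max_words:
        followers = chain[state]
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == END:
            break
        words.append(word)
        state = state[1:] + (word,)
    return " ".join(words)


def make_original_sentence(chain, originals, state_size=2, tries=10, rng=random):
    """Retry up to `tries` times; return None if every attempt copies a corpus sentence."""
    for _ in range(tries):
        sentence = make_sentence(chain, state_size, rng)
        if sentence not in originals:
            return sentence
    return None


if __name__ == "__main__":
    # Toy corpora, invented purely for illustration.
    corpus1 = ["the cat sat on the mat"]
    corpus2 = ["the cat ran to the door"]
    model = combine(build_chain(corpus1, 1), build_chain(corpus2, 1), scale1=2, scale2=1)
    print(make_original_sentence(model, set(corpus1 + corpus2), state_size=1))
```

With a larger scale1, transitions seen in corpus1 are drawn proportionally more often. A larger stateSize makes output more coherent but closer to the source text, which is why a retry budget (tries) is useful: on tiny corpora with stateSize 2, every generated sentence may simply reproduce an original.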

Deployment

Twitter secrets are set via environment variables, e.g.

    export CONSUMER_KEY="consumer_key"
    export CONSUMER_SECRET="consumer_secret"
    export ACCESS_KEY="access_token"
    export ACCESS_SECRET="access_token_secret"

An example deployment is described at the bottom of this blog post.
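
For a deployment script, it can help to fail early when one of the four variables above is missing. The following is a small sketch of such a check; load_twitter_credentials is a hypothetical helper name, and how Dankenstein itself reads these variables is not shown here. The returned values could then be handed to a Twitter client such as Tweepy.

```python
import os

# Variable names taken from the export lines above.
REQUIRED = ("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_KEY", "ACCESS_SECRET")


def load_twitter_credentials(env=os.environ):
    """Return the four Twitter secrets as a dict, or raise if any is unset or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing Twitter credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

Passing a plain dict instead of os.environ makes the check easy to exercise without touching the real environment.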