Consuming Multiple Archives Into A Single Model For A twitter_ebooks Bot

Published: May 25, 2017


Recently, I launched my own ebooks bot.

If you read the twitter_ebooks README, you’ll see that the ebooks consume command generates a text model for the bot to work from, based on either a JSON archive of tweets or a plain text file.

This is nice, but it left me with one question: can I build my text model from multiple sources?

It’s not currently documented in the README, but it turns out you can, using the ebooks consume-all command.

The signature is as follows:

ebooks consume-all <model_name> <corpus_path> [corpus_path2] [...]
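
To make the signature concrete, here is a minimal sketch of an invocation. The model name my_bot and both corpus file names are placeholders I made up for illustration, so substitute your own. I'm also assuming you run this from your bot's directory, and the guard simply skips the call when the ebooks executable isn't installed.

```shell
#!/bin/sh
# Hypothetical model name; pick whatever you want the combined model called.
MODEL_NAME="my_bot"

# Argument order follows the signature: model name first, then each corpus path.
# Both file names are placeholders (a JSON tweet archive and a plain text file).
set -- "$MODEL_NAME" corpus/my_tweets.json corpus/blog_posts.txt

# Show the full command before running it.
CMD="ebooks consume-all $*"
echo "$CMD"

# Only invoke ebooks if it is actually on the PATH.
if command -v ebooks >/dev/null 2>&1; then
  ebooks consume-all "$@"
fi
```

As far as I can tell, the combined model ends up under model/ in the bot's directory, named after the model name you passed, but treat that location as an assumption and check your own project.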

You’ll see consume-all mentioned in the usage string if you run ebooks with no arguments.

$ ebooks
     ebooks help <command>

     ebooks new <reponame>
     ebooks s[tart]
     ebooks c[onsole]
     ebooks auth
     ebooks consume <corpus_path> [corpus_path2] [...]
     ebooks consume-all <model_name> <corpus_path> [corpus_path2] [...]
     ebooks append <model_name> <corpus_path>
     ebooks gen <model_path> [input]
     ebooks archive <username> [path]
     ebooks tweet <model_path> <botname>
     ebooks version

While this is helpful, I think the feature should also be documented in the README.

I submitted a PR to do just that. However, the project hasn’t been updated in a while, so I’m not sure if or when it will be merged.


I hope you found this post helpful. If you have any questions or comments, feel free to drop a note below, or, as always, you can reach me on Twitter as well.
