Add background info to the README

parent 3bc9e434ba, commit 81a13d37fb (README.md)

Fediverse ebooks bot using neural networks

## Background

[Text generation programs](https://en.wikipedia.org/wiki/ELIZA) have existed for decades. However, most [ebooks bots](https://github.com/Lynnesbian/mstdn-ebooks) today rely on Markov chains, which are fast but also produce text that sounds like you pulled some random words out of a hat (which isn't entirely inaccurate in this case).
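
To make the contrast concrete, here is a minimal word-level Markov chain generator of the kind such bots use (an illustrative sketch, not code from any of those projects; the function names are invented):

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words observed after it."""
    words = text.split()
    chain = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        chain[prev].append(nxt)
    return chain

def generate(chain, start, length=10):
    """Walk the chain, picking each next word at random."""
    word, out = start, [start]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break
        word = random.choice(followers)
        out.append(word)
    return " ".join(out)

chain = build_chain("the cat sat on the mat and the dog sat on the rug")
print(generate(chain, "the"))
```

Because each word depends only on the one before it, the output tends to read like exactly that hatful of random words.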

However, we have another solution. A very overhyped solution, yes: neural networks! Here are some samples produced by [this bot](https://social.exozy.me/@ebooks/):

> I toot. I don't want to. I'm happy with it. I like to make people laugh. The only thing I want is to use it to do a wonderful job.

> This is total BS, and there really is no difference between this and the Matrix Matrix Matrix Matrix Matrix.

> Follow me for the next few days. Please remember, I'm sorry, and I'm sorry for the inconvenience.

> @Gargron @gargron @mattkat @craj_chris I am not a lawyer. I am just the voice of God. I'm a non-profit organization and I can be seen in other ways.

As you can see, neural networks generate much more coherent text and even learn how to use mentions and hashtags. The caveat? Training the network takes *only* a few hours, and generating text takes a few seconds, whereas a Markov chain does both almost instantaneously.

This bot consists of three components:

- The `data.py` script accesses the fediverse server's database and retrieves all messages to form the training data for the neural network.
- The `train.py` script downloads a pre-trained [DistilGPT2](https://huggingface.co/distilgpt2) (you can use a larger model like GPT-J if your hardware is powerful enough) and [fine-tunes](https://huggingface.co/docs/transformers/training) it on the training data from the database. By using a pre-trained model, the bot starts out with broad knowledge of many topics, including ones never mentioned in the training data. This step takes a long time, but you only have to do it once.
- The `bot.py` script uses the fine-tuned model to generate text, and posts it to your fediverse server.
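
The generation half of that last step might look roughly like this sketch, built on the transformers `pipeline` API; the model directory name, prompt, and sampling parameters are assumptions for illustration, not the bot's actual settings:

```python
def generate_post(model_dir="model", prompt="I"):
    """Load the fine-tuned model and sample one post (illustrative sketch)."""
    from transformers import pipeline  # imported lazily; requires the transformers package

    generator = pipeline("text-generation", model=model_dir)
    result = generator(prompt, max_new_tokens=50, do_sample=True)
    return result[0]["generated_text"]
```

Posting the result is then just a call to your platform's client library, such as Mastodon.py.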
## Usage
First, install the Python dependencies using your distro's package manager or `pip`: [psycopg2](https://www.psycopg.org), [torch](https://pytorch.org/), [transformers](https://huggingface.co/docs/transformers/index), and [datasets](https://huggingface.co/docs/datasets/). Additionally, install [Mastodon.py](https://mastodonpy.readthedocs.io/en/stable/) for Mastodon and Pleroma, [Misskey.py](https://misskeypy.readthedocs.io/ja/latest/) for Misskey, or [simplematrixbotlib](https://simple-matrix-bot-lib.readthedocs.io/en/latest/index.html) for Matrix. If your database or platform isn't supported, don't worry! It's easy to add support for other platforms and databases, and contributions are welcome!

Now retrieve the training data from your fediverse server's database using `python data.py -d 'dbname=test user=postgres password=secret host=localhost port=5432'`. Retrieving the training data from the database is not yet supported for Matrix. You can skip this step if you have collected training data from another source.
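
For the curious, the retrieval step boils down to something like this sketch; the `statuses` table and `text` column are assumptions about a Mastodon-style schema, not necessarily what `data.py` actually queries:

```python
# Hypothetical query; real schemas differ between fediverse platforms.
TRAINING_DATA_QUERY = "SELECT text FROM statuses WHERE local = true"

def fetch_statuses(conninfo):
    """Return all local posts as a list of strings."""
    import psycopg2  # imported lazily so the sketch loads without the driver

    conn = psycopg2.connect(conninfo)
    with conn, conn.cursor() as cur:
        cur.execute(TRAINING_DATA_QUERY)
        return [row[0] for row in cur.fetchall()]
```

It takes the same connection string shown above, e.g. `fetch_statuses('dbname=test user=postgres password=secret host=localhost port=5432')`.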
Next, train the network with `python train.py`, which may take several hours. It's a lot faster when using a GPU. If you need advanced features when training, you can also train using [run_clm.py](https://github.com/huggingface/transformers/blob/master/examples/pytorch/language-modeling/run_clm.py).
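
For reference, a compressed sketch of what such a fine-tuning run looks like with the Hugging Face Trainer API; the corpus file name, output directory, and hyperparameters here are assumptions, not what `train.py` actually uses:

```python
def fine_tune(corpus_path="data.txt", output_dir="model"):
    """Fine-tune DistilGPT2 on a plain-text corpus (illustrative sketch)."""
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 models ship without a pad token
    model = AutoModelForCausalLM.from_pretrained("distilgpt2")

    # Treat each line of the corpus as one training example.
    dataset = load_dataset("text", data_files=corpus_path)["train"]
    tokenized = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True),
        batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=1),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()  # hours on CPU, much faster on a GPU
    trainer.save_model(output_dir)
```

`run_clm.py` does essentially this, plus the advanced options mentioned above.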