Automated Scambaiting University Project

Edit: more context and the current state of the project are in a reply below

Heya, hopefully this is under the right tag. For my Natural Language Processing course I managed to swing an experiment on different ways of tuning the system prompt of Llama 3.2 for scambaiting. It's already working and somewhat effective.

The code is here: https://github.com/jjj333-p/llama-scambait-experiment, although it is not presentable yet (I'm cramming for the presentation and submission date right now). If you can send scammers to [email protected] that would be amazing; or, if you forward the email there from a client like Thunderbird, you can set the Reply-To address, which is what the bot will go by.

I also have loose plans to publish the email logs once I'm able to; perhaps I'll just throw together a Python script to turn the JSON files I have into Markdown that I can post in this thread.
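Something along these lines should do it. This is only a minimal sketch, and it assumes a made-up log schema (each `.json` file in a `logs/` folder being a list of `{"role", "subject", "body"}` objects), so the field names would need to be adjusted to whatever the bot actually writes out:

```python
#!/usr/bin/env python3
"""Turn JSON conversation logs into a single Markdown file for posting.

Assumes a hypothetical schema: each .json file in logs/ is a list of
{"role": ..., "subject": ..., "body": ...} messages. Adjust the field
names to match the real log format.
"""
import json
from pathlib import Path


def conversation_to_markdown(path: Path) -> str:
    messages = json.loads(path.read_text(encoding="utf-8"))
    lines = [f"## {path.stem}", ""]
    for msg in messages:
        lines.append(f"**{msg.get('role', 'unknown')}**: {msg.get('subject', '(no subject)')}")
        lines.append("")
        # block-quote the email body so it renders cleanly in the forum
        body = msg.get("body", "").strip()
        lines.extend(f"> {line}" for line in body.splitlines() or [""])
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    parts = [conversation_to_markdown(p) for p in sorted(Path("logs").glob("*.json"))]
    Path("latest_output.md").write_text("\n\n".join(parts), encoding="utf-8")
```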

Any participation is much appreciated.


Heya all, here's a formatted output of the current scambaiting history (all conversations of length greater than 2, i.e. initial email, AI response, scammer response): main/latest_output.md

As for whether or not I'm using the "Edited system prompt": the crux of the experiment, and how I got my professor to sign off on it, is that I'm testing whether an LLM can be fed a chat history plus its current system prompt and write a better one, better suited to the task. Basically, it's an attempt to fine-tune an LLM for scambaiting without the compute or effort required to actually fine-tune a large language model.
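To make that concrete, the loop is roughly: take the transcript so far plus the prompt currently in use, hand both to the model with a meta-instruction, and feed back whatever it produces. This is only a sketch of the idea, not the code from the repo, and it assumes the model is served through Ollama's Python client (the project may invoke Llama differently):

```python
import ollama  # assumes a local Ollama server; the real project may call Llama another way

META_INSTRUCTION = (
    "You are tuning another assistant that replies to scam emails to waste the scammer's time. "
    "Below is the system prompt it currently uses and a transcript of one of its conversations. "
    "Write an improved system prompt that keeps the other party engaged longer. "
    "Output ONLY the new system prompt, nothing else."
)


def refine_system_prompt(current_prompt: str, transcript: str, model: str = "llama3.1:8b") -> str:
    """Ask the model to rewrite its own system prompt based on the chat history."""
    response = ollama.chat(
        model=model,
        messages=[
            {"role": "system", "content": META_INSTRUCTION},
            {
                "role": "user",
                "content": f"Current system prompt:\n{current_prompt}\n\nTranscript:\n{transcript}",
            },
        ],
    )
    return response["message"]["content"].strip()
```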

I probably won't publicly release the research paper I am writing on this unless it turns out to be somehow revolutionary, as it has my real identity all over it; I can just informally post my results here. So far I haven't been able to do enough A/B testing against a static system prompt, since not enough time has passed, but I can say this much: the LLM definitely seems capable of outputting a better system prompt, one that to my eyes is much more engaging. I cannot tell whether that makes a meaningful difference to scammers.

If the chat history gets too long, it seems to overpower the system prompt and the model just starts responding as if it were part of the conversation, and even with only a couple of messages I have to repeat several times that it is supposed to output a new system prompt to feed back in. As you can see, at least once it switched to giving a system prompt that is just an email template lmao. It's also a huge pain to parse its output, because if I tell it to output just the system prompt, or give any kind of description of the desired output, it gets distracted and either complains about helping a scam or just responds as if it were in the conversation. The very properties that make an LLM desirable (i.e. non-reproducibility) make its output a huge pain to parse back in.
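One workaround for the parsing problem (not something the repo does yet, and the tag name here is made up) is to ask for the new prompt wrapped in explicit markers and keep the old prompt whenever extraction fails, so a drifted or refused reply just means "no update this round":

```python
import re

# hypothetical markers the meta-instruction would ask the model to wrap its answer in
PROMPT_RE = re.compile(r"<new_system_prompt>(.*?)</new_system_prompt>", re.DOTALL)


def extract_new_prompt(raw_output: str, fallback: str) -> str:
    """Pull the refined system prompt out of the model's reply.

    If the model drifted (replied in-character, refused, produced an email
    template, etc.) the markers won't be present, so we keep the previous
    prompt instead of feeding garbage back into the loop.
    """
    match = PROMPT_RE.search(raw_output)
    return match.group(1).strip() if match else fallback
```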

For what it's worth, I am using Llama 3.1 at 8 billion parameters (the smallest base model) in CPU-only mode; I don't know whether other models would perform better. Llama is just the best fit for the hardware I'm working with, and I don't at the moment have the means to pay for GPT-4o (I can't imagine 4o-mini would do much better, and it would still cost something).
