People bypassing anti-swear system by using duplicate letters

Question:

OK, so I made an anti-swear system in discord.py, but you can easily bypass it. Here’s an example: let’s say "cat" is a swear word. You can just type "ccaatt" and the bot won’t detect it. How do I fix this?
Here’s the code:

@client.event
async def on_message(message):
    try:
      if message.author.bot:
        return
      if client.user in message.mentions:
          response = f"Hello! I'm Stealth Bot. My prefix is `-`. To see a list of commands do `-help`."
          await message.channel.send(response)
          print(f"{message.author.name} pinged me in {message.channel}!")
          pass
      if message.channel.id == 828667602351161354 or message.channel.id == 820049182860509206:
        pass
      else:
        if profanity.contains_profanity(message.content):
            await message.delete()
            warnMessage = f"Hey {message.author.mention}! Don't say that!\n*You said ||{message.content}||*"
            await message.channel.send(warnMessage, delete_after=5.0)
            print(f"{message.author.name} tried saying: {message.content}")
            channel = client.get_channel(836232733126426666)

            embed = discord.Embed(title=f"Someone tried to swear!", colour=0x2D2D2D)
            embed.add_field(name="Person who tried to swear:", value=f"{message.author.name}", inline=False)
            embed.add_field(name="What they tried to say:", value=f"{message.content}", inline=False)
            embed.add_field(name="Debug:", value=f"{profanity}", inline=False)
            embed.add_field(name="Debug 2:", value=f"{message.content.lower()}", inline=False)
            embed.add_field(name="Channel they tried to swear in:", value=f"<#{message.channel.id}>", inline=False)

            await channel.send(embed=embed)
            pass
      await client.process_commands(message)
    except Exception as e:
        print(e)
Asked By: Ender 2K89


Answers:

You could try something like this:

def replaceDoubleCharacters(string):
    lastLetter, replacedString = "", ""
    for letter in string:
        if letter != lastLetter:
            replacedString += letter
        lastLetter = letter
    return replacedString

# your code
if profanity.contains_profanity(message.content) or profanity.contains_profanity(replaceDoubleCharacters(message.content)):
    await message.delete()
# more of your code

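replaceDoubleCharacters collapses every run of repeated characters into a single character, so "ccaatt" becomes "cat". Checking the original message as well means words that legitimately contain double letters are still detected. Keep in mind, though, that people trying to bypass restrictions is very common, and they will find other ways.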
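
As a rough sketch of the same idea, the collapsing step can also be written with a regular expression, and the check can be run against a few normalized forms of the message instead of just one. Everything named here (BLOCKED_WORDS, collapse_repeats, candidate_forms, contains_blocked_word) is made up for illustration, and the word list is only a placeholder:

```python
import re

# Hypothetical word list, for illustration only.
BLOCKED_WORDS = {"cat", "dog"}


def collapse_repeats(text, keep=1):
    """Shrink every run of one repeated character to `keep` copies."""
    return re.sub(r"(.)\1+", lambda m: m.group(1) * keep, text)


def candidate_forms(text):
    """Return the lowercased text plus two collapsed variants.

    keep=1 catches "ccaatt" -> "cat"; keep=2 preserves words that
    really do contain a double letter, e.g. "cooool" -> "cool".
    """
    lowered = text.lower()
    return {lowered, collapse_repeats(lowered, 1), collapse_repeats(lowered, 2)}


def contains_blocked_word(text):
    """True if any whole word in any candidate form is blocked."""
    for form in candidate_forms(text):
        if any(word in BLOCKED_WORDS for word in re.findall(r"[a-z]+", form)):
            return True
    return False
```

In the bot, each string returned by candidate_forms(message.content) could be passed to profanity.contains_profanity instead of the hypothetical word list. This only handles repeated letters; spacing, symbols, and look-alike characters would need their own normalization steps.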

Answered By: itzFlubby

The way I would do it is to use difflib, like this:

import difflib

swearWords = ["cat", "dog"]
swearWordsFound = difflib.get_close_matches("stringToCheck", swearWords)
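
One thing to watch: get_close_matches compares the whole string you pass in against each list entry, so a full sentence will score too low to match anything. A minimal sketch that checks word by word might look like this. The function name find_close_swears and the SWEAR_WORDS constant are hypothetical, and the cutoff of 0.6 is just difflib's default, not a tuned value:

```python
import difflib

# Placeholder list, same as in the snippet above.
SWEAR_WORDS = ["cat", "dog"]


def find_close_swears(message_text, cutoff=0.6):
    """Return the set of listed words that some word in the message resembles."""
    hits = set()
    for word in message_text.lower().split():
        # n=1 keeps only the best match for each word in the message.
        hits.update(
            difflib.get_close_matches(word, SWEAR_WORDS, n=1, cutoff=cutoff)
        )
    return hits
```

A lower cutoff catches more disguised spellings but also flags innocent near-misses (at 0.6, "cart" is close enough to "cat"), so the value needs tuning against real messages.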
Answered By: loloToster

To be honest, a proper Discord moderation bot built with discord.py requires machine learning. No matter how much you normalize the messages, there will be many flaws. There was a competition on Kaggle to rate toxic messages, and models built for that competition were packaged into a Python module called detoxify. If you want a very good anti-swear system, and the cloud machine you are deploying on has decent compute and a bit of storage, I think you can use it. I would recommend Heroku; it is widely used for deploying ML applications.
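
A rough sketch of how that could be wired up, assuming detoxify is installed (pip install detoxify) and that its Detoxify("original").predict(text) call returns a dict of label scores between 0 and 1 for a single string. The helper names (load_model, score_message, is_toxic) and the 0.8 threshold are assumptions for illustration, not part of the library:

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def load_model():
    """Load the model once; the first call downloads the weights."""
    from detoxify import Detoxify  # third-party package, imported lazily
    return Detoxify("original")


def score_message(text):
    """Return a dict of label -> score for one message."""
    return load_model().predict(text)


def is_toxic(scores, threshold=0.8):
    """True if any label's score reaches the (arbitrary) threshold."""
    return any(value >= threshold for value in scores.values())
```

In on_message this would be something like is_toxic(score_message(message.content)). Model inference is slow compared with a word list, so in an async bot it would probably need to run in an executor rather than directly in the event handler.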

Answered By: SAYANTAN MAZUMDAR

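Let's say you are using your own Discord bot. Just do this:

if 'swear word here' in message.content:
    await message.delete()

(This only applies if your bot is written in Python. The in operator calls the string's __contains__ method, spelled with two underscores before and after contains, so you don't need to call it directly.)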
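
A plain substring check also flags harmless words that merely contain a blocked word. Here is a small sketch comparing it with a whole-word match built on regex word boundaries. BLOCKED_WORDS and both function names are hypothetical:

```python
import re

# Hypothetical word list, for illustration only.
BLOCKED_WORDS = ["cat", "dog"]

# \b matches at word boundaries, so "cat" will not match inside "category".
BLOCKED_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in BLOCKED_WORDS) + r")\b",
    re.IGNORECASE,
)


def has_blocked_substring(text):
    """Substring check: simple, but matches inside longer words too."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKED_WORDS)


def has_blocked_word(text):
    """Whole-word check: matches only when the blocked word stands alone."""
    return BLOCKED_PATTERN.search(text) is not None
```

Neither version catches "ccaatt" on its own, so either check would still need the message to be normalized first.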

Answered By: Substics