Dirty, malformed, and outright mischievous text strings have long been the enemy of developers of interactive websites. Strings can contain any combination of letters, numbers, spaces, and punctuation, and users enter them into text boxes on websites. Crafted carefully, these strings can do everything from exposing XSS vulnerabilities to triggering 404 error pages.
When looking for vulnerabilities and glitches on your own website or someone else's, you could come up with your own text strings to test, but there are plenty of lists out there that already have all you need. One such list is the Big List of Naughty Strings, created and maintained by minimaxir, aka Max Woolf, a Software QA Engineer working in San Francisco, California.
Minimaxir has been curating this list since at least June 2015. The list changes as vulnerabilities are fixed on the server end and as new ways to cause havoc are discovered. Lists like these are standard practice for web developers checking their sites for issues and vulnerabilities. They are also popular with anyone who wants to stir things up a little by doing much the same. These strings can break websites and worse.
You can find minimaxir's list on his GitHub page, where you can grab it via Git or with a standard HTTP download. You can also just browse the file directly on GitHub. I grabbed the ZIP file by hitting the green "Clone or download" button to the right.
Once I expanded the download on my local filesystem, I was left with a directory called "big-list-of-naughty-strings-master," which contained the following files.
The blns.txt file is the only thing we are interested in here. The remainder of the files deal mostly with acquiring this folder via Git, or using the word list programmatically.
Opening blns.txt in a text editor shows me the words and text strings.
There are far more words and strings than what you see in the above screenshot. The list is 686 lines long, so you'll need to scroll down quite a way. Comments, set off by #, label the strings by section, which saves you a bit of time and better explains the section you're investigating.
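If you would rather not scroll through the file by hand, the layout described above (comment lines starting with #, blank separator lines, one string per line) is easy to read programmatically. The following is a minimal sketch, not part of minimaxir's repository: the function name load_naughty_strings, the SAMPLE excerpt, and the assumption that blns.txt sits in the current directory are all illustrative.

```python
from pathlib import Path


def load_naughty_strings(text):
    """Return the test strings from blns.txt-style text.

    Lines starting with '#' are section comments and empty lines are
    separators; everything else is kept exactly as written.
    """
    strings = []
    # Split on "\n" only: str.splitlines() would also break on exotic
    # Unicode line separators, which may be part of a test string.
    for line in text.split("\n"):
        # Compare against "" instead of stripping, so that lines made up
        # purely of unusual whitespace survive as test strings.
        if line == "" or line.startswith("#"):
            continue
        strings.append(line)
    return strings


# Hypothetical excerpt in the same layout as blns.txt.
SAMPLE = """# Reserved Strings
#
# Strings which may be used elsewhere in code

undefined
null

# Script Injection

<script>alert(123)</script>
"""

path = Path("blns.txt")  # assumed location of the downloaded file
text = path.read_text(encoding="utf-8") if path.exists() else SAMPLE
for s in load_naughty_strings(text):
    print(repr(s))
```

Printing with repr() keeps invisible or control characters readable in the terminal instead of letting them act on it.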
This part is so easy that anyone can do it. Literally anyone. All you have to do is visit a website that has a box for text input and submit one of the text strings. Choose any string from the list and give it a try. There are lots to choose from: numeric strings, special characters, Unicode symbols, two-byte characters, right-to-left strings, script injection, and more.
As an example, to check for an easy cross-site scripting (XSS) vulnerability, try one of the strings from the list's script injection section.
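To see what such a check is looking for, here is a rough sketch of the difference between a page that echoes input verbatim and one that escapes it first. Everything in it is an assumption made for illustration: the two render functions are hypothetical stand-ins for a site's server-side code, and the payload is simply a typical script injection string of the kind the list contains.

```python
import html


def render_comment_unsafe(user_text):
    # Hypothetical handler that pastes user input straight into the page.
    return "<p>" + user_text + "</p>"


def render_comment_safe(user_text):
    # Same handler, but HTML-escaping the input before it is embedded.
    return "<p>" + html.escape(user_text) + "</p>"


# Illustrative script injection string (assumed, not quoted from the article).
payload = "<script>alert(123)</script>"

print(render_comment_unsafe(payload))  # the script tag survives intact
print(render_comment_safe(payload))    # the tag is turned into inert text
```

If submitting the string makes a browser pop up an alert box, the site is behaving like the first function. If the string just shows up as literal text on the page, it is behaving like the second.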
You can also check your own site for accurate profanity filtering:
Super Bowl XXX
(Yes, that fails on many site search functions because of the triple-X bit.)
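The reason is usually a filter that looks for blocked substrings anywhere in the input. The sketch below is a deliberately naive filter of that kind, written only to show the false positive. The blocklist, the function names, and the second sample phrase are made up for this example and do not come from any particular site.

```python
# Hypothetical blocklist of the sort a simple filter might use.
BLOCKED_SUBSTRINGS = ["xxx"]

# Phrases that a sensible filter should let through.
KNOWN_LEGITIMATE = ["Super Bowl XXX", "Super Bowl 50"]


def naive_filter(text):
    """Return True if the text contains any blocked substring (case-insensitive)."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKED_SUBSTRINGS)


def false_positives(phrases):
    """Return the legitimate phrases that the naive filter wrongly blocks."""
    return [p for p in phrases if naive_filter(p)]


print(false_positives(KNOWN_LEGITIMATE))  # ['Super Bowl XXX']
```

Keeping a small set of known-legitimate phrases like this next to the filter makes it easy to spot when a blocklist change starts rejecting harmless input.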
You can also find a site built with Ruby on Rails and see how well it's constructed:
eval("puts 'hello world'")
System("ls -al /")
`ls -al /`
Kernel.exec("ls -al /")
%x('ls -al /')
Reddit caught wind of this list last month, so you might glean something interesting there from those in the know. However, it seems that Reddit itself is locked down pretty well, since someone posted nearly the entire blns.txt file there and nothing happened.
Fact is, it's harder and harder to mess up sites via text input these days. But you never know until you try! Just keep in mind that if you're using these strings against a site which is not your own, you may find yourself blocked. That means the developers are doing their job right.