regex for Twitter username
Question:
Could you provide a regex that matches Twitter usernames?
Extra bonus if a Python example is provided.
Answers:
If you’re talking about the @username thing they use on Twitter, then you can use this:
import re
twitter_username_re = re.compile(r'@([A-Za-z0-9_]+)')
To make every instance an HTML link, you could do something like this:
my_html_str = twitter_username_re.sub(lambda m: '<a href="http://twitter.com/%s">%s</a>' % (m.group(1), m.group(0)), my_tweet)
The only characters accepted in the form are A-Z, 0-9, and underscore. Usernames are not case-sensitive, though, so you could use r'(?i)@[a-z0-9_]+' to match every capitalization of a name; compare the matches case-insensitively to tell users apart.
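As a rough illustration of the case-insensitivity point, the sketch below collects mentions and lowercases them so that @Alice and @alice count as the same user. The names MENTION_RE and mentioned_users are made up for this example, and it uses the re.IGNORECASE flag rather than an inline (?i).

```python
import re

# Case-insensitive mention pattern; the capture group holds the name without "@".
# MENTION_RE and mentioned_users are hypothetical names used only in this sketch.
MENTION_RE = re.compile(r'@([a-z0-9_]+)', re.IGNORECASE)

def mentioned_users(tweet):
    """Return the distinct mentioned usernames, lowercased, in first-seen order."""
    seen = []
    for name in MENTION_RE.findall(tweet):
        key = name.lower()  # usernames are not case-sensitive
        if key not in seen:
            seen.append(key)
    return seen

print(mentioned_users("Thanks @Alice and @alice, also @Bob_99!"))
# prints ['alice', 'bob_99']
```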
Twitter recently open-sourced implementations of the code they use for finding user names, hashtags, lists and URLs, in various languages including Java, Ruby (gem) and JavaScript.
It is very regular expression oriented.
Shorter: /@([\w]+)/ works fine.
(?<=^|(?<=[^a-zA-Z0-9-_.]))@([A-Za-z]+[A-Za-z0-9-_]+)
I’ve used this as it disregards emails.
Here is a sample tweet:
@Hello how are @you doing @my_friend, email @000 me @ [email protected] @shahmirj
Matches:
- @Hello
- @you
- @my_friend
- @shahmirj
It will also work for hashtags; I use the same expression with the @ changed to #.
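For Python, a possible adaptation is sketched below. It is not the expression above verbatim: it swaps the nested lookbehind for a single negative lookbehind, which I believe accepts the same strings. The sample text and the address someone@example.com are invented for the demonstration.

```python
import re

# Assumed equivalent of the expression above, rewritten with a negative lookbehind:
# the "@" must not follow a letter, digit, "_", "." or "-", and the name must
# start with a letter and be at least two characters long.
MENTION_RE = re.compile(r'(?<![A-Za-z0-9_.-])@([A-Za-z][A-Za-z0-9_-]+)')

# Same expression with the "@" changed to "#", for hashtags.
HASHTAG_RE = re.compile(MENTION_RE.pattern.replace('@', '#'))

sample = ("@Hello how are @you doing @my_friend, email @000 me @ "
          "someone@example.com @shahmirj #regex")

print(MENTION_RE.findall(sample))  # prints ['Hello', 'you', 'my_friend', 'shahmirj']
print(HASHTAG_RE.findall(sample))  # prints ['regex']
```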
The regex I use, which has been tested in multiple contexts:
/(^|[^@\w])@(\w{1,15})\b/
This is the cleanest way I’ve found to test and replace Twitter usernames in strings.
#!/usr/bin/python
import re
text = "@RayFranco is answering to @jjconti, this is a real '@username83' but this is [email protected], and this is a @probablyfaketwitterusername"
# Group 1 keeps the character before the @; group 2 is the username (at most 15 characters).
ftext = re.sub(r'(^|[^@\w])@(\w{1,15})\b', r'\1<a href="http://twitter.com/\2">\2</a>', text)
print(ftext)
This returns, as expected:
<a href="http://twitter.com/RayFranco">RayFranco</a> is answering to <a href="http://twitter.com/jjconti">jjconti</a>, this is a real '<a href="http://twitter.com/username83">username83</a>' but this is [email protected], and this is a @probablyfaketwitterusername
Based on the Twitter specs:
Your username cannot be longer than 15 characters. Your real name can be longer (20 characters), but usernames are kept shorter for the sake of ease.
A username can only contain alphanumeric characters (letters A-Z, numbers 0-9) with the exception of underscores, as noted above. Check to make sure your desired username doesn’t contain any symbols, dashes, or spaces.
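To see why the trailing \b matters for the 15-character limit, here is a small comparison. The sample sentence and the variable names are made up for the illustration; the first pattern is the one from this answer and the second is the same pattern with \b dropped.

```python
import re

text = "ping @jjconti and @probablyfaketwitterusername"

# Pattern from the answer: the \b forces the name to end within 15 characters.
with_boundary = re.compile(r'(^|[^@\w])@(\w{1,15})\b')
# Same pattern without \b: it happily matches the first 15 characters of a longer name.
without_boundary = re.compile(r'(^|[^@\w])@(\w{1,15})')

with_hits = [m.group(2) for m in with_boundary.finditer(text)]
without_hits = [m.group(2) for m in without_boundary.finditer(text)]

print(with_hits)     # prints ['jjconti']
print(without_hits)  # prints ['jjconti', 'probablyfaketwi']
```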
This is a method I have used in a project. It takes the text attribute of a tweet object and returns the text with both the hashtags and user_mentions linked to their appropriate pages on Twitter, complying with the most recent Twitter display guidelines:
import re

def link_tweet(tweet):
    """
    This method takes the text attribute from a tweet object and returns it with
    user_mentions and hashtags linked
    """
    # Link @mentions that start the text or follow whitespace.
    tweet = re.sub(r'(\A|\s)@(\w+)', r'\1@<a href="http://www.twitter.com/\2">\2</a>', str(tweet))
    # Link #hashtags the same way; %23 is the URL-encoded "#".
    return re.sub(r'(\A|\s)#(\w+)', r'\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>', str(tweet))
When you call this method, pass in my_tweet[x].text. Hope this is helpful.
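A quick way to check the behaviour is to run the two substitutions on a plain string. The sketch below restates them under a hypothetical name, link_entities, and uses an invented sample text; because the @ and # must start the text or follow whitespace, the email address is left alone.

```python
import re

def link_entities(text):
    """Link @mentions and #hashtags that start the text or follow whitespace."""
    # link_entities is a hypothetical restatement of the two substitutions above.
    text = re.sub(r'(\A|\s)@(\w+)',
                  r'\1@<a href="http://www.twitter.com/\2">\2</a>', text)
    return re.sub(r'(\A|\s)#(\w+)',
                  r'\1#<a href="http://search.twitter.com/search?q=%23\2">\2</a>', text)

linked = link_entities("@alice check #python, mail bob@example.com")
print(linked)
```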
This regex seems to match Twitter usernames:
^@[A-Za-z0-9_]{1,15}$
Max 15 characters. It allows underscores directly after the @ (which Twitter does) and allows all-underscore names (which, after a quick search, I found that Twitter apparently also does). It excludes email addresses.
In case you need to match all of the handle, @handle and twitter.com/handle formats, this is a variation:
import re
match = re.search(r'^(?:.*twitter\.com/|@?)(\w{1,15})(?:$|/.*$)', text)
handle = match.group(1) if match else None
Explanation, examples and working regex here:
https://regex101.com/r/7KbhqA/3
Matched
myhandle
@myhandle
@my_handle_2
twitter.com/myhandle
https://twitter.com/MyHandle
https://twitter.com/myhandle/randomstuff
Not matched
mysuperhandleistoolong
@mysuperhandleistoolong
https://twitter.com/mysuperhandleistoolong
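The lists above can be checked with a small helper. This is only a sketch: HANDLE_RE and extract_handle are names invented here, and the helper returns None for inputs that do not match instead of raising.

```python
import re

# Accepts "handle", "@handle" and "...twitter.com/handle[/anything]".
HANDLE_RE = re.compile(r'^(?:.*twitter\.com/|@?)(\w{1,15})(?:$|/.*$)')

def extract_handle(text):
    """Return the handle found in text, or None when nothing matches."""
    match = HANDLE_RE.search(text)
    return match.group(1) if match else None

for candidate in ["myhandle", "@my_handle_2",
                  "https://twitter.com/myhandle/randomstuff",
                  "@mysuperhandleistoolong"]:
    print(candidate, "->", extract_handle(candidate))
```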
You can use the following regex: ^@[A-Za-z0-9_]{1,15}$
In Python:
import re
pattern = re.compile('^@[A-Za-z0-9_]{1,15}$')
pattern.match('@Your_handle')
This will check if the string exactly matches the regex.
In a ‘practical’ setting, you could use it as follows:
pattern = re.compile('^@[A-Za-z0-9_]{1,15}$')
if pattern.match('@Your_handle'):
    print('Match')
else:
    print('No Match')
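One caveat worth knowing: in Python, $ also matches just before a single trailing newline, so the check is not quite "exact" for input such as a line read from a file. A possible stricter variant, sketched here with an invented name strict, drops the anchors and uses fullmatch.

```python
import re

pattern = re.compile(r'^@[A-Za-z0-9_]{1,15}$')
# Hypothetical stricter variant: no anchors, rely on fullmatch instead.
strict = re.compile(r'@[A-Za-z0-9_]{1,15}')

candidate = '@Your_handle\n'
loose_result = bool(pattern.match(candidate))      # $ tolerates one trailing newline
strict_result = bool(strict.fullmatch(candidate))  # fullmatch must consume the whole string

print(loose_result)   # prints True
print(strict_result)  # prints False
```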
I have used the existing answers and modified them for my use case (the username must be longer than 4 characters):
^[A-Za-z0-9_]{5,15}$
Rules:
- Your username must be longer than 4 characters.
- Your username must be no longer than 15 characters.
- Your username can only contain letters, numbers and ‘_’.
Source: https://help.twitter.com/en/managing-your-account/twitter-username-rules
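A minimal Python sketch of these rules follows; USERNAME_RE and is_valid_username are names chosen for the example. Spelling the letter ranges out as A-Za-z matters, because a single A-z range also covers the ASCII punctuation that sits between Z and a, such as ^ and the backtick.

```python
import re

# 5 to 15 characters, letters, digits and underscore only (no leading "@").
USERNAME_RE = re.compile(r'[A-Za-z0-9_]{5,15}')

def is_valid_username(name):
    """Return True when name satisfies the length and character rules."""
    return USERNAME_RE.fullmatch(name) is not None

print(is_valid_username("abcd"))      # prints False (too short)
print(is_valid_username("abcde"))     # prints True
print(is_valid_username("a" * 16))    # prints False (too long)
print(is_valid_username("bad^name"))  # prints False

# For comparison, an A-z range lets "^" through:
print(bool(re.fullmatch(r'[A-z0-9_]{5,15}', "bad^name")))  # prints True
```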