Generating an email address from first name and last name with Faker in Python

Question:

I am trying to generate a pandas dataset comprising person data. I am employing Python’s Faker library. Is there a way to generate a valid email address using the first name and last name?

import pandas as pd
import numpy as np
import os
import random
from faker import Faker

fake = Faker()

def faker_categorical(num=1, seed=None):
    np.random.seed(seed)
    fake.seed_instance(seed)
    output = []
    for x in range(num):
      gender = np.random.choice(["M", "F"], p=[0.5, 0.5])
      output.append(
        {
            "First name": fake.first_name_male() if gender=="M" else  
                                                 fake.first_name_female(),
            "Last name": fake.last_name(),
            "E-mail": fake.ascii_email(),  
        })
    return output
Asked By: Nanda

Answers:

You can use Faker’s domain_name method and string formatting alongside the already generated values:

first_name = fake.first_name_male() if gender =="M" else fake.first_name_female()
last_name = fake.last_name()

output.append(
    {
     "First name": first_name,
     "Last Name": last_name,
     "E-mail": f"{first_name}.{last_name}@{fake.domain_name()}"
    }
)
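One caveat with plain string formatting: generated names can contain spaces, apostrophes, or accented characters (for example with non-English locales), which do not make a clean address. Below is a minimal sketch of a helper that normalizes the two names before joining them. The names `email_local_part` and `make_email` are hypothetical, not part of Faker, and the cleaning rules (lowercase, ASCII only, hyphens for separators) are just one reasonable assumption:

```python
import re
import unicodedata


def email_local_part(first_name: str, last_name: str) -> str:
    """Build a lowercase 'first.last' local part using only a-z, 0-9 and hyphens."""

    def clean(name: str) -> str:
        # Decompose accented characters, then drop everything that is not ASCII.
        decomposed = unicodedata.normalize("NFKD", name)
        ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
        # Collapse runs of other characters (spaces, apostrophes, ...) into one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")

    parts = [clean(first_name), clean(last_name)]
    local = ".".join(part for part in parts if part)
    if not local:
        # Names without any Latin letters reduce to nothing; the caller needs a fallback.
        raise ValueError("names contain no characters usable in an email address")
    return local


def make_email(first_name: str, last_name: str, domain: str) -> str:
    """Join the cleaned local part with a domain supplied by the caller."""
    return f"{email_local_part(first_name, last_name)}@{domain}"
```

In the snippet above you would then write "E-mail": make_email(first_name, last_name, fake.domain_name()) instead of the f-string.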

For a more complete approach, you could add factory_boy to the mix:

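from factory import DictFactory, Faker, LazyAttribute
from factory.fuzzy import FuzzyChoice
from faker import Faker as FakerGenerator

# factory.Faker is a declaration, so the lambdas below need a plain Faker instance.
fake = FakerGenerator()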

class PersonDataFactory(DictFactory):

    first = LazyAttribute(lambda obj: fake.first_name_male() if obj._gender == "M" else fake.first_name_female())
    last = Faker("last_name")
    email = LazyAttribute(lambda obj: f"{obj.first}.{obj.last}@{fake.domain_name()}")
    _gender = FuzzyChoice(("M", "F"))

    class Meta:
        exclude = ("_gender",)
        rename = {"first": "First Name", "last": "Last Name", "email": "E-mail"}


PersonDataFactory()

which will result in something like:

{'First Name': 'Albert',
 'Last Name': 'Martinez',
 'E-mail': 'Albert.Martinez@example.com'}
Answered By: hectorcanto

Here is an alternative to the accepted answer in case you need to build more complex data. It relies on data providers, which are part of the Faker library. You define a new provider and add it to your instance of Faker(). You can then call the new generator method wherever you need it.

from faker import Faker
from faker.providers import BaseProvider

class CustomProvider(BaseProvider):
    __provider__ = "personalia"
    
    def personalia(self):
        gender = self.random_element(["F", "M"])
        first_name = self.generator.first_name_male() if gender == "M" else self.generator.first_name_female()
        last_name = self.generator.last_name()
        email_address = f"{first_name.lower()}.{last_name.lower()}@{self.generator.domain_name()}"
        
        return {
          "First name": first_name,
          "Last Name": last_name,
          "E-mail": email_address
        }
        
fake = Faker()
fake.add_provider(CustomProvider)

personalia = fake.personalia()
print(personalia)

The output should look like this:

{
 'First name': 'Olivia',
 'Last Name': 'Cook',
 'E-mail': 'olivia.cook@example.com'
}

Of course, this is just a simple example based on the code you have provided. 😉

Answered By: thoroc