When should I be using classes in Python?

Question:

I have been programming in python for about two years; mostly data stuff (pandas, mpl, numpy), but also automation scripts and small web apps. I’m trying to become a better programmer and increase my python knowledge and one of the things that bothers me is that I have never used a class (outside of copying random flask code for small web apps). I generally understand what they are, but I can’t seem to wrap my head around why I would need them over a simple function.

To add specificity to my question: I write tons of automated reports which always involve pulling data from multiple data sources (mongo, sql, postgres, apis), performing a lot or a little data munging and formatting, writing the data to csv/excel/html, send it out in an email. The scripts range from ~250 lines to ~600 lines. Would there be any reason for me to use classes to do this and why?

Asked By: metersk

||

Answers:

I think you do it right. Classes are reasonable when you need to simulate some business logic or difficult real-life processes with difficult relations.
As example:

  • Several functions with share state
  • More than one copy of the same state variables
  • To extend the behavior of an existing functionality

I also suggest you to watch this classic video

Answered By: valignatev

Classes are the pillar of Object Oriented Programming. OOP is highly concerned with code organization, reusability, and encapsulation.

First, a disclaimer: OOP is partially in contrast to Functional Programming, which is a different paradigm used a lot in Python. Not everyone who programs in Python (or surely most languages) uses OOP. You can do a lot in Java 8 that isn’t very Object Oriented. If you don’t want to use OOP, then don’t. If you’re just writing one-off scripts to process data that you’ll never use again, then keep writing the way you are.

However, there are a lot of reasons to use OOP.

Some reasons:

  • Organization:
    OOP defines well known and standard ways of describing and defining both data and procedure in code. Both data and procedure can be stored at varying levels of definition (in different classes), and there are standard ways about talking about these definitions. That is, if you use OOP in a standard way, it will help your later self and others understand, edit, and use your code. Also, instead of using a complex, arbitrary data storage mechanism (dicts of dicts or lists or dicts or lists of dicts of sets, or whatever), you can name pieces of data structures and conveniently refer to them.

  • State: OOP helps you define and keep track of state. For instance, in a classic example, if you’re creating a program that processes students (for instance, a grade program), you can keep all the info you need about them in one spot (name, age, gender, grade level, courses, grades, teachers, peers, diet, special needs, etc.), and this data is persisted as long as the object is alive, and is easily accessible. In contrast, in pure functional programming, state is never mutated in place.

  • Encapsulation:
    With encapsulation, procedure and data are stored together. Methods (an OOP term for functions) are defined right alongside the data that they operate on and produce. In a language like Java that allows for access control, or in Python, depending upon how you describe your public API, this means that methods and data can be hidden from the user. What this means is that if you need or want to change code, you can do whatever you want to the implementation of the code, but keep the public APIs the same.

  • Inheritance:
    Inheritance allows you to define data and procedure in one place (in one class), and then override or extend that functionality later. For instance, in Python, I often see people creating subclasses of the dict class in order to add additional functionality. A common change is overriding the method that throws an exception when a key is requested from a dictionary that doesn’t exist to give a default value based on an unknown key. This allows you to extend your own code now or later, allow others to extend your code, and allows you to extend other people’s code.

  • Reusability: All of these reasons and others allow for greater reusability of code. Object oriented code allows you to write solid (tested) code once, and then reuse over and over. If you need to tweak something for your specific use case, you can inherit from an existing class and overwrite the existing behavior. If you need to change something, you can change it all while maintaining the existing public method signatures, and no one is the wiser (hopefully).

Again, there are several reasons not to use OOP, and you don’t need to. But luckily with a language like Python, you can use just a little bit or a lot, it’s up to you.

An example of the student use case (no guarantee on code quality, just an example):

Object Oriented

class Student(object):
    def __init__(self, name, age, gender, level, grades=None):
        self.name = name
        self.age = age
        self.gender = gender
        self.level = level
        self.grades = grades or {}

    def setGrade(self, course, grade):
        self.grades[course] = grade

    def getGrade(self, course):
        return self.grades[course]

    def getGPA(self):
        return sum(self.grades.values())/len(self.grades)

# Define some students
john = Student("John", 12, "male", 6, {"math":3.3})
jane = Student("Jane", 12, "female", 6, {"math":3.5})

# Now we can get to the grades easily
print(john.getGPA())
print(jane.getGPA())

Standard Dict

def calculateGPA(gradeDict):
    return sum(gradeDict.values())/len(gradeDict)

students = {}
# We can set the keys to variables so we might minimize typos
name, age, gender, level, grades = "name", "age", "gender", "level", "grades"
john, jane = "john", "jane"
math = "math"
students[john] = {}
students[john][age] = 12
students[john][gender] = "male"
students[john][level] = 6
students[john][grades] = {math:3.3}

students[jane] = {}
students[jane][age] = 12
students[jane][gender] = "female"
students[jane][level] = 6
students[jane][grades] = {math:3.5}

# At this point, we need to remember who the students are and where the grades are stored. Not a huge deal, but avoided by OOP.
print(calculateGPA(students[john][grades]))
print(calculateGPA(students[jane][grades]))
Answered By: dantiston

A class defines a real world entity. If you are working on something that exists individually and has its own logic that is separate from others, you should create a class for it. For example, a class that encapsulates database connectivity.

If this not the case, no need to create class

Answered By: Ashutosh

Whenever you need to maintain a state of your functions and it cannot be accomplished with generators (functions which yield rather than return). Generators maintain their own state.

If you want to override any of the standard operators, you need a class.

Whenever you have a use for a Visitor pattern, you’ll need classes. Every other design pattern can be accomplished more effectively and cleanly with generators, context managers (which are also better implemented as generators than as classes) and POD types (dictionaries, lists and tuples, etc.).

If you want to write “pythonic” code, you should prefer context managers and generators over classes. It will be cleaner.

If you want to extend functionality, you will almost always be able to accomplish it with containment rather than inheritance.

As every rule, this has an exception. If you want to encapsulate functionality quickly (ie, write test code rather than library-level reusable code), you can encapsulate the state in a class. It will be simple and won’t need to be reusable.

If you need a C++ style destructor (RIIA), you definitely do NOT want to use classes. You want context managers.

Answered By: Dmitry Rubanovich

It depends on your idea and design. If you are a good designer, then OOPs will come out naturally in the form of various design patterns.

For simple script-level processing, OOPs can be overhead.

Simply consider the basic benefits of OOPs like reusability and extendability and make sure if they are needed or not.

OOPs make complex things simpler and simpler things complex.

Simply keep the things simple in either way using OOPs or not using OOPs. Whichever is simpler, use that.

Answered By: Mohit Thakur

dantiston gives a great answer on why OOP can be useful. However, it is worth noting that OOP is not necessary a better choice most cases it is used. OOP has the advantage of combining data and methods together. In terms of application, I would say that use OOP only if all the functions/methods are dealing and only dealing with a particular set of data and nothing else.

Consider a functional programming refactoring of dentiston’s example:

def dictMean( nums ):
    return sum(nums.values())/len(nums)
# It's good to include automatic tests for production code, to ensure that updates don't break old codes
assert( dictMean({'math':3.3,'science':3.5})==3.4 )

john = {'name':'John', 'age':12, 'gender':'male', 'level':6, 'grades':{'math':3.3}}

# setGrade
john['grades']['science']=3.5

# getGrade
print(john['grades']['math'])

# getGPA
print(dictMean(john['grades']))

At a first look, it seems like all the 3 methods exclusively deal with GPA, until you realize that Student.getGPA() can be generalized as a function to compute mean of a dict, and re-used on other problems, and the other 2 methods reinvent what dict can already do.

The functional implementation gains:

  1. Simplicity. No boilerplate class or selfs.
  2. Easily add automatic test code right after each
    function for easy maintenance.
  3. Easily split into several programs as your code scales.
  4. Reusability for purposes other than computing GPA.

The functional implementation loses:

  1. Typing in 'name', 'age', 'gender' in dict key each time is not very DRY (don’t repeat yourself). It’s possible to avoid that by changing dict to a list. Sure, a list is less clear than a dict, but this is a none issue if you include an automatic test code below anyway.

Issues this example doesn’t cover:

  1. OOP inheritance can be supplanted by function callback.
  2. Calling an OOP class has to create an instance of it first. This can be boring when you don’t have data in __init__(self).
Answered By: zyc
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.