Abstract
The article introduces Kotlin data classes, which are designed to hold data with minimal boilerplate code. It explains that data classes automatically generate equals()/hashCode(), toString(), componentN(), and copy() functions, simplifying the handling of data-centric objects. The author emphasizes the importance of understanding when to use data classes and when to define custom
Data Classes
An introduction to data classes, what destructuring is, the Pair and Triple classes and a non-obvious reason for why it’s crucial to think hard before you use them
— — — — — — — — — — — — — — —
THE CURRENT VERSION OF THIS ARTICLE IS PUBLISHED HERE.
This article is part of the Kotlin Primer, an opinionated guide to the Kotlin language, which is intended to help facilitate Kotlin adoption inside Java-centric organizations. It was originally written as an organizational learning resource for Etnetera a.s. and I would like to express my sincere gratitude for their support.
As the name suggests, data classes are classes meant to hold data:
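A minimal sketch of such a declaration, assuming a User class with name and age properties (the shape implied by the toString() example in this article):

```kotlin
// A data class: both properties are declared in the primary constructor.
data class User(val name: String, val age: Int)

fun main() {
    val john = User("John", 42)
    println(john.name) // John
    println(john.age)  // 42
}
```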
The difference between a data class and a normal class is that, with data classes, the compiler automatically derives the following members from all properties declared in the primary constructor:
equals()/hashCode() pair
toString() of the form "User(name=John, age=42)"
componentN() functions corresponding to the properties in their order of declaration, e.g. User("John", 3).component2() == 3. We’ll talk more about componentN() below, and again when we discuss operators.
copy() function
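The generated members listed above can be observed in a short sketch, again assuming a User(name, age) data class:

```kotlin
data class User(val name: String, val age: Int)

fun main() {
    val a = User("John", 42)
    val b = User("John", 42)

    println(a == b)                       // true: generated equals() compares properties
    println(a === b)                      // false: two distinct instances
    println(a.hashCode() == b.hashCode()) // true: hashCode() is consistent with equals()
    println(a)                            // User(name=John, age=42)
    println(a.component1())               // John
    println(a.component2())               // 42
}
```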
You can use the copy() function to copy an object and alter some of its properties while keeping the rest unchanged:
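For instance, assuming the same User(name, age) data class, a copy with only the age changed might look like this:

```kotlin
data class User(val name: String, val age: Int)

fun main() {
    val john = User("John", 42)
    // Only the named argument changes; name is carried over from the original.
    val olderJohn = john.copy(age = 43)

    println(john)      // User(name=John, age=42)
    println(olderJohn) // User(name=John, age=43)
}
```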
You can exclude properties from the code generation described above by declaring them inside the class body:
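A sketch of what that looks like, using a hypothetical Person class whose age property lives in the class body:

```kotlin
data class Person(val name: String) {
    // Declared in the body, so it is ignored by the generated
    // equals(), hashCode(), toString(), componentN() and copy().
    var age: Int = 0
}

fun main() {
    val a = Person("John").apply { age = 10 }
    val b = Person("John").apply { age = 20 }

    println(a == b) // true: only name is compared
    println(a)      // Person(name=John)
}
```

Note that two instances with different ages compare as equal here, which is easy to overlook.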
Data classes must fulfill the following requirements:
The primary constructor needs to have at least one parameter
All primary constructor parameters need to be marked as val or var
Data classes cannot be abstract, open, sealed, or inner (we’ll talk about sealed classes in a future article)
There are a few more rules regarding data classes, and you can read about them in the docs.
Destructuring
While we haven’t talked about operators yet, they enable an important property of data classes that is worth mentioning right away — data class instances can be destructured:
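A minimal sketch of a destructuring declaration, assuming the User(name, age) data class used earlier:

```kotlin
data class User(val name: String, val age: Int)

fun main() {
    // Equivalent to calling component1() and component2() on the instance.
    val (name, age) = User("John", 42)
    println("$name is $age years old") // John is 42 years old
}
```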
When we talk about operators, we will see that this ability is shared by all classes that implement the componentN operators, and is not specific to data classes.
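As a brief preview, here is a sketch of a regular (non-data) class that can be destructured. The Size class is hypothetical and exists only for illustration:

```kotlin
// Not a data class: the componentN operators are written by hand.
class Size(private val width: Int, private val height: Int) {
    operator fun component1(): Int = width
    operator fun component2(): Int = height
}

fun main() {
    val (w, h) = Size(3, 4)
    println("$w x $h") // 3 x 4
}
```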
Standard data classes
The standard library provides the Pair and Triple classes. These can be useful when you need a single-shot data structure for some purpose, but they are easy to overuse — always consider whether you should define an actual class.
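A quick sketch of how these two classes are used. Their properties are named first, second and third, which is part of why they say so little about what the data means:

```kotlin
fun main() {
    val pair = Pair("John", 42)
    val triple = Triple(1, 2, 3)

    println(pair.first)   // John
    println(pair.second)  // 42
    println(triple.third) // 3

    // Both are data classes, so they can be destructured.
    val (name, age) = pair
    println("$name, $age") // John, 42
}
```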
The exercise at the end of the article demonstrates why it’s not a good idea to overuse them. The exercise is trivial, but it contains a very important message that pertains directly to what was discussed in the Preface: creating explicit classes can be the difference between a runtime error and a compile-time error (as you will see in the exercise).
A good rule of thumb is “does the structure I’m creating have a human name or is it a specific ‘thing’ in the business context it appears in?”. If the answer is “yes”, then you should definitely create a separate, named class. Type-aliases (discussed in a future article) don’t count! They won’t save you from runtime errors.
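To illustrate why type aliases don't help, here is a rough sketch. The names PointAlias, VectorAlias, describe and describeSafe are hypothetical and are not taken from the exercise:

```kotlin
// Two aliases for the same underlying type: the compiler treats them as identical.
typealias PointAlias = Pair<Double, Double>
typealias VectorAlias = Pair<Double, Double>

fun describe(point: PointAlias, direction: VectorAlias): String =
    "at $point heading $direction"

// Explicit classes are distinct types, so they cannot be mixed up.
data class Point(val x: Double, val y: Double)
data class Vector(val dx: Double, val dy: Double)

fun describeSafe(point: Point, direction: Vector): String =
    "at $point heading $direction"

fun main() {
    val origin: PointAlias = Pair(0.0, 0.0)
    val east: VectorAlias = Pair(1.0, 0.0)

    // Arguments swapped by mistake: this compiles and silently gives a wrong result.
    println(describe(east, origin)) // at (1.0, 0.0) heading (0.0, 0.0)

    // The same mistake with explicit classes is rejected by the compiler:
    // describeSafe(Vector(1.0, 0.0), Point(0.0, 0.0)) // type mismatch
    println(describeSafe(Point(0.0, 0.0), Vector(1.0, 0.0)))
}
```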
Examples of things that should definitely have their own classes are Vector and Point in the exercise below. An example of something where it might not be necessary could be “sales reps associated to the amount of money they have brought in”. If you’re in the context of a method that determines which sales rep made the most money for the company, then “sales reps associated to the amount of money they have brought in” is probably just an intermediate data structure which is traversed to find the maximum and then thrown away. In this case, using Pair is completely fine. However, if you are in the business context of “the customer wants to send an e-mail report containing a list of sales reps associated to the amount of money they have brought in”, then it becomes a clear business concept, and should have its own appropriately-named class, e.g. data class SalesRepReportData(val rep: User, val amountMade: ValueUSD). It might be tempting to use Map<User, ValueUSD>, but this is wrong, again for the same reasons as stated above.
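The two situations might be sketched as follows. The shapes of User and ValueUSD, the Sale class and both functions are assumptions made for illustration; only SalesRepReportData comes from the text above:

```kotlin
data class User(val name: String)
data class ValueUSD(val dollars: Long)          // assumed representation
data class Sale(val rep: User, val amount: ValueUSD) // hypothetical input type

// A named business concept gets its own class.
data class SalesRepReportData(val rep: User, val amountMade: ValueUSD)

// Intermediate, throwaway structure: Pair is fine here.
fun topSalesRep(sales: List<Sale>): User? {
    val totals: List<Pair<User, Long>> = sales
        .groupBy { it.rep }
        .map { (rep, repSales) -> Pair(rep, repSales.sumOf { it.amount.dollars }) }
    return totals.maxByOrNull { it.second }?.first
}

// The report is something the customer asked for, so it uses the named class.
fun buildReport(sales: List<Sale>): List<SalesRepReportData> =
    sales.groupBy { it.rep }
        .map { (rep, repSales) ->
            SalesRepReportData(rep, ValueUSD(repSales.sumOf { it.amount.dollars }))
        }

fun main() {
    val sales = listOf(
        Sale(User("Ann"), ValueUSD(100)),
        Sale(User("Bob"), ValueUSD(250)),
        Sale(User("Ann"), ValueUSD(200)),
    )
    println(topSalesRep(sales)) // User(name=Ann)
    println(buildReport(sales))
}
```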
Finally, while we haven’t yet discussed infix functions, we will mention that Pair offers the to infix function that can be used to construct Pair instances elegantly:
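A short sketch of the to function in use, including the common case of building a map:

```kotlin
fun main() {
    // "John" to 42 is the same as Pair("John", 42).
    val pair = "John" to 42
    println(pair == Pair("John", 42)) // true

    // mapOf accepts Pair instances, so to reads naturally here.
    val ages = mapOf("John" to 42, "Jane" to 37)
    println(ages["Jane"]) // 37
}
```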