Default mapping behavior for Spring Data MongoDB


If you have used Spring Data MongoDB, have you ever wondered what exactly the default mapping behavior is between a MongoDB document and a Java POJO? What do I mean by that?

Imagine you have the following Java POJO

@Getter
@Builder
@ToString
@TypeAlias("person")
@Document("person")
public class Person {
    @Id
    private String id;
    private String name;
    private int age;
    private Integer height;
    private boolean isAdult;
    private Address address;
    private List<String> hobbies;
    private List<Address> addresses;
}

public record Address(String street, String postalCode) {}
  • The class uses Lombok's @Getter, @Builder, and @ToString annotations
  • Note that I have purposely defined the POJO with a mix of primitive and non-primitive (reference) data types

What happens when I save a Person object using repository.save(person) and some of the values are not set?

And if we flip the scenario, imagine you have the following MongoDB Document

{
    "_id": "63563b8cc056726c1d30c1c3",
    "name": "hello",
    "age": 30,
    "height": 177,
    "isAdult": true,
    "address": {
        "street": "18 New Road",
        "postalCode": "654321"
    },
    "hobbies": ["soccer", "volleyball"],
    "addresses": [
        {
            "street": "17 Old Road",
            "postalCode": "123456"
        }
    ]
}

What happens when I retrieve this document from MongoDB using repository.findById("63563b8cc056726c1d30c1c3") and some of the fields do not exist?

In this blog post, I will focus on the default behavior without digging into the internals, and discuss what it means and what I think the default values should be.

Assumption

You have a local instance of MongoDB installed, with username root and password password.

Setup

The setup is simple and straightforward. Aside from the POJOs defined above, we only need to define the repository and the test class.

@Repository
public interface PersonRepository extends MongoRepository<Person, String> {

}
@DataMongoTest(excludeAutoConfiguration = EmbeddedMongoAutoConfiguration.class)
class PersonRepositoryCrudTests {
    @Autowired
    private PersonRepository repository;

    @Test
    @Order(1)
    void createEmptyObjectWithBuilder() {
        // to fill up later....
    }
}

EmbeddedMongoAutoConfiguration is excluded because I'm not using an embedded MongoDB.

Tests

Empty / Null Object

@Test
void createEmptyObjectWithBuilder() {
    Person toCreate = Person.builder().name(null).build();
    Person created = this.repository.save(toCreate);

    Assertions.assertThat(created.getName()).isNull();
    Assertions.assertThat(created.getAge()).isZero();
    Assertions.assertThat(created.getHeight()).isZero();
    Assertions.assertThat(created.isAdult()).isFalse();
    Assertions.assertThat(created.getAddress()).isNull();
    Assertions.assertThat(created.getHobbies()).isEmpty();
    Assertions.assertThat(created.getAddresses()).isEmpty();

    log.info("{}", created);
}

I'm using Lombok's @Builder to create the object, Assertions from AssertJ, and Lombok's @Slf4j annotation for the log statement.

The output is

// document in mongodb
{
    "_id" : ObjectId("63564997e88e844484c40d97"),
    "age" : NumberInt(0),
    "isAdult" : false,
    "_class" : "person"
}
// log output
Person(id=6356996013bcce4608b5c0b5, name=null, age=0, height=null, isAdult=false, address=null, hobbies=null, addresses=null)

What do we learn or see from this simple test case?

In terms of saving the document to MongoDB

  • Any non-primitive field that is not set (or is null) will have no corresponding field in the created document
  • Any primitive field that is not set will have a corresponding field in the created document, set to its default value (e.g. 0 for int, false for boolean, and so on)

In terms of getting the document from MongoDB

  • Any non-primitive field that does not exist in the document (or is null) will be set to null
  • Any primitive field that does not exist in the document will be set to its default value (e.g. 0 for int, false for boolean, and so on)
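The read-side rules mirror how Java itself initializes fields. As a minimal plain-Java sketch (no Spring or MongoDB involved; FieldDefaults and Sample are hypothetical names used only for illustration), this shows the values a field holds when nothing ever assigns to it:

```java
import java.util.List;

final class FieldDefaults {
    // Hypothetical class mirroring a few Person fields; nothing is ever assigned.
    static class Sample {
        int age;              // primitive: starts at 0
        boolean adult;        // primitive: starts at false
        Integer height;       // reference: starts at null
        String name;          // reference: starts at null
        List<String> hobbies; // reference: starts at null
    }

    // Builds a string from the untouched fields of a fresh instance.
    static String describe() {
        Sample s = new Sample();
        return "age=" + s.age + ", adult=" + s.adult + ", height=" + s.height
                + ", name=" + s.name + ", hobbies=" + s.hobbies;
    }

    public static void main(String[] args) {
        // prints: age=0, adult=false, height=null, name=null, hobbies=null
        System.out.println(describe());
    }
}
```

This lines up with the log output above: the primitive fields show 0 and false, while every reference field shows null.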

Impact

What does it mean to not have a field exist in the document, and for a field to be null as a default value?

In MongoDB, it is perfectly acceptable for a field not to exist in a given document; that's the beauty of NoSQL and its schemaless design.

In a weakly typed language such as JavaScript, it would be fine too (I suspect). However, for a strongly typed language such as Java, this can be a big issue, in my opinion, if not handled carefully. Why? In short, I have to be extra careful whenever I use such fields because of the possibility of a NullPointerException.
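To make the risk concrete, here is a small plain-Java sketch, assuming a nullable Integer such as height has come back as null. NullUnboxing, unboxingThrows and heightOrDefault are hypothetical names for illustration only:

```java
import java.util.Objects;

final class NullUnboxing {
    // Auto-unboxing a null Integer throws NullPointerException.
    static boolean unboxingThrows(Integer height) {
        try {
            int primitive = height; // compiles to height.intValue()
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Defensive read: fall back to a caller-supplied value when height is null.
    static int heightOrDefault(Integer height, int fallback) {
        return Objects.requireNonNullElse(height, fallback);
    }

    public static void main(String[] args) {
        System.out.println(unboxingThrows(null));     // prints: true
        System.out.println(heightOrDefault(null, 0)); // prints: 0
    }
}
```

Every call site that touches such a field needs a guard like heightOrDefault, which is exactly the extra care described above.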

Recommendation

Sensible Defaults

To ensure that fields are not null by default, we need to define default values. The right choice depends largely on your use case, but here are some sensible defaults that should apply to most use cases.

Primitive

Technically, you don't have to assign a default to a Java primitive, as one already exists. Unless you want a different default, you don't have to do anything.

Non-Primitive (Reference)

In the Person POJO, the default for the various data types should be:

  • String: empty string (or whatever better suits your use case)
  • Integer: 0 (or whatever better suits your use case)
  • Address: null (or whatever better suits your use case)
  • List<String>: empty list
  • List<Address>: empty list

It is particularly challenging to choose a default for a field whose type is another class (e.g. Address): assigning null risks a NullPointerException, but most of the time there is no sensible default for such a field.
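One way to picture these defaults, independent of any mapping framework, is a plain-Java sketch that normalizes nulls when the object is constructed. PersonView and addressOptional are hypothetical names, and this is only an illustration of the defaults listed above, not how Spring Data MongoDB applies them:

```java
import java.util.List;
import java.util.Optional;

final class SensibleDefaults {
    record Address(String street, String postalCode) {}

    // Hypothetical, trimmed-down view of Person; nulls are normalized on construction.
    record PersonView(String name, Integer height, Address address, List<String> hobbies) {
        PersonView {
            name = (name == null) ? "" : name;
            height = (height == null) ? 0 : height;
            hobbies = (hobbies == null) ? List.of() : List.copyOf(hobbies);
            // address is left as-is: there is no obvious default for it
        }

        // Expose the nullable reference as an Optional so callers must handle absence.
        Optional<Address> addressOptional() {
            return Optional.ofNullable(address);
        }
    }
}
```

With this sketch, new SensibleDefaults.PersonView(null, null, null, null) yields an empty name, a height of 0, an empty hobbies list, and an empty Optional for the address.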

Use Lombok Builder.Default

Since we are already using Lombok, Builder.Default (available since v1.16.16) is a possible solution: it lets you specify the value a field takes when it is not set through the builder.

Let's update our Person POJO class to the following:

public class Person {
    @Id
    private String id;
    @Builder.Default
    private String name = "";
    private int age;
    @Builder.Default
    private Integer height = 0;
    private boolean isAdult;
    @Builder.Default
    private List<String> hobbies = List.of();
    @Builder.Default
    private List<Address> addresses = List.of();
}

Re-running the previous test produces the following output

Person(id=63565798338016237d70f0b0, name=null, age=0, height=0, isAdult=false, hobbies=[], addresses=[])

Notice that, except for name, which is still null, the fields are initialized with their default values. name is null because the test case explicitly set it to null when building the object.
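The difference between "never set" and "explicitly set to null" can be sketched with a hand-written builder. This is only an approximation of the idea and not the code Lombok actually generates; PersonSketch and its members are hypothetical names:

```java
import java.util.List;

final class PersonSketch {
    private final String name;
    private final List<String> hobbies;

    private PersonSketch(String name, List<String> hobbies) {
        this.name = name;
        this.hobbies = hobbies;
    }

    String name() { return name; }
    List<String> hobbies() { return hobbies; }

    static Builder builder() { return new Builder(); }

    static final class Builder {
        private String name;
        private boolean nameSet;    // true once name(...) was called, even with null
        private List<String> hobbies;
        private boolean hobbiesSet; // true once hobbies(...) was called

        Builder name(String name) {
            this.name = name;
            this.nameSet = true;
            return this;
        }

        Builder hobbies(List<String> hobbies) {
            this.hobbies = hobbies;
            this.hobbiesSet = true;
            return this;
        }

        PersonSketch build() {
            // The default is used only when the setter was never called.
            String n = nameSet ? name : "";
            List<String> h = hobbiesSet ? hobbies : List.of();
            return new PersonSketch(n, h);
        }
    }
}
```

In this sketch, PersonSketch.builder().name(null).build() keeps name as null while hobbies falls back to an empty list, whereas PersonSketch.builder().build() gives an empty string for name.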

The downside of Builder.Default is that the defaults only apply when you construct the object through the builder. If you construct the object by other means (e.g. new Person()), they are not applied. Also note that there are known pitfalls when combining Builder.Default with an explicit constructor, currently tracked at lombok-github-issue.

Conclusion

We tested the default mapping behavior between a Java POJO and a MongoDB document when using Spring Data MongoDB, looked at the pitfalls, and saw how we might overcome them.

However, please proceed with caution: this is a very contrived example that leaves out many other considerations, such as constructing the object via a default or explicit constructor.

If there is one thing to take away from this, it is to know the default mapping behavior, and to avoid null as a default value as much as possible.

Source Code

As usual, the full source code is available on GitHub.