Before we talk about the advantages and disadvantages of MongoDB, it’s important to know what MongoDB is, and get a basic understanding of its features and operation. If you don’t already know what MongoDB is, you might want to start with this post: MongoDB for Beginners – An Introduction to MongoDB, then come back here.
Very briefly, MongoDB is an open source database that uses a data storage schema that can be customised (more on this in the post).
Development began in 2007 and the first public release followed in 2009. Today it is one of the most popular non-relational databases. It was built with scalability, high availability and good performance in mind, and as of 2022 it is at version 5.0.
Let me start with a question: what is the difference between a relational and a non-relational database? A NoSQL database stores data differently from a relational database: instead of storing data in tables of rows and columns, MongoDB stores every record as a document in BSON, a binary representation of JSON-like data. Applications retrieve this information in JSON format.
NoSQL stands for ‘Not only SQL’ and covers several types of database, such as column-oriented, document, graph and key-value stores; MongoDB is of the document type, as mentioned above.
In relational databases, developers need to translate tables to the object model to make them suitable for use in the application. With MongoDB, however, both the stored data and the object model share the same BSON structure, which is easily translated to JSON.
As mentioned before, MongoDB uses documents as its basic unit of storage, which allows almost any data structure to be modelled and manipulated easily. MongoDB’s BSON data format, inspired by JSON, allows objects in the same collection to have different sets of fields. For example, if a user is married we can also store the name of the spouse; if the user is single, this field simply doesn’t need to exist.
This flexibility is an incredible advantage when dealing with real-world data and changes in business rules or requirements.
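To make the spouse example concrete, here is a minimal JavaScript sketch using plain objects in place of real MongoDB documents. The field names (`maritalStatus`, `spouse`) and the `describeUser` helper are assumptions made up for illustration; nothing here talks to a database.

```javascript
// Two user documents that could live in the same collection.
// Only the married user carries a "spouse" field.
const users = [
  { name: "Ana", maritalStatus: "married", spouse: "Carlos" },
  { name: "Bruno", maritalStatus: "single" }, // no spouse field at all
];

// Hypothetical helper: handle the optional field when reading a document.
function describeUser(user) {
  return "spouse" in user
    ? `${user.name} is married to ${user.spouse}`
    : `${user.name} has no spouse on record`;
}

const descriptions = users.map(describeUser);
console.log(descriptions);
```

The point of the sketch is that the application, not the database, decides what to do when a field is absent.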
Most databases force us to use frameworks, wrappers or heavy-duty tools like ORMs (Object-Relational Mappers) to get data in the form of objects for use in programs.
MongoDB’s decision to store and represent data in a document format means that you can access it from any language, in data structures native to that language (e.g. dictionaries in Python, objects in JavaScript, maps in Java, etc.).
However, you will need a driver (a library that creates a connection between your application and the MongoDB database). You can check the list of all official MongoDB drivers on this link.
In the example below, we can see how to connect to a MongoDB database and save new records using PHP.
<?php
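require 'vendor/autoload.php'; // assumes the MongoDB PHP library was installed with Composer
// Connect to a local server and select the students collection of the demo database
$client = new MongoDB\Client("mongodb://localhost:27017");
$collection = $client->demo->students;
// Insert the first student
$result = $collection->insertOne( [ 'name' => 'Gustavo', 'group' => '9A' ] );
echo "New student ID: '{$result->getInsertedId()}'";
// Insert a second student that has an extra field (shift)
$result = $collection->insertOne( [ 'name' => 'Tomas', 'group' => '9C', 'shift' => 'morning' ] );
echo "New student ID: '{$result->getInsertedId()}'";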
?>
In the example, I open a connection to the database called demo and look for the students collection. In this collection I insert two new records and, as you can see, the second record has a different set of fields from the first, which is perfectly acceptable in MongoDB.
Thanks to the document model used in MongoDB, information can be embedded within a single document rather than relying on the JOIN operations of traditional relational databases.
For example, in the traditional database model, we would have a table for our customers and a separate table with the addresses of those customers.
We can easily reproduce this same model in MongoDB by creating two collections, one for clients and one for addresses. To create the reference between the documents, each address has a field that makes the link, in this case client_id.
// client document
{
_id: "72c4hxrt",
name: "Joe Bookreader"
}
// address documents
{
client_id: "72c4hxrt", // reference to the customer's document
street: "123 Rua dos anjos",
city: "São Paulo",
state: "SP",
postal_code: "200128-120"
}
{
client_id: "72c4hxrt",
street: "1 Rua Oliveira ",
city: "Salvador",
state: "BA",
postal_code: "291021-122"
}
With MongoDB we can simplify this and bring all the information together in a single document using the Embedded Document Pattern. In this case, embedding the client’s addresses in the client document makes the most sense.
// client document
{
_id: "72c4hxrt",
name: "Joe Bookreader",
addresses: [
{
street: "123 Rua dos anjos",
city: "São Paulo",
state: "SP",
postal_code: "200128-120"
},
{
street: "1 Rua Oliveira ",
city: "Salvador",
state: "BA",
postal_code: "291021-122"
}
]
}
This makes queries much faster, because all the necessary information is returned in a single call to the database.
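If you already have data in the two-collection shape, the move to the embedded shape can be sketched in plain JavaScript. The `embedAddresses` function below is a hypothetical helper written for this post, and the arrays stand in for query results; it is not a MongoDB API.

```javascript
// Data in the referenced (two-collection) shape.
const client = { _id: "72c4hxrt", name: "Joe Bookreader" };
const addresses = [
  { client_id: "72c4hxrt", street: "123 Rua dos anjos", city: "São Paulo", state: "SP" },
  { client_id: "72c4hxrt", street: "1 Rua Oliveira", city: "Salvador", state: "BA" },
  { client_id: "someone-else", street: "9 Other Street", city: "Recife", state: "PE" },
];

// Hypothetical helper: build the embedded document for one client.
function embedAddresses(client, addresses) {
  const embedded = addresses
    .filter((address) => address.client_id === client._id)
    // Drop client_id: the link is implicit once the address is embedded.
    .map(({ client_id, ...rest }) => rest);
  return { ...client, addresses: embedded };
}

const embeddedClient = embedAddresses(client, addresses);
console.log(JSON.stringify(embeddedClient, null, 2));
```

Whether embedding is the right choice still depends on how the data is read and how large the embedded array can grow.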
When it comes to write performance, MongoDB offers methods to insert and update multiple records at once: insertMany and updateMany. Because a single call covers many records, these methods avoid one round trip to the server per record and can be significantly faster than writing records one at a time.
In our example, where we created two student records with two separate calls to the insertOne method, we could have used the insertMany method instead, as you can see below.
<?php
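require 'vendor/autoload.php'; // assumes the MongoDB PHP library was installed with Composer
$client = new MongoDB\Client("mongodb://localhost:27017");
$collection = $client->demo->students;
// insertMany expects a single array that contains all the documents
$result = $collection->insertMany( [
[ 'name' => 'Gustavo', 'group' => '9A' ],
[ 'name' => 'Tomas', 'group' => '9C', 'shift' => 'morning' ]
] );
echo "Inserted {$result->getInsertedCount()} students";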
?>
Another big advantage of MongoDB is that it keeps the most frequently accessed data and indexes cached in RAM, which avoids disk reads and allows queries to run faster. For the fastest processing, ensure that your indexes fit entirely in RAM so that the system can avoid reading the index from disk.
A transactional database is a database that supports ACID (atomicity, consistency, isolation and durability) transactions. A transaction is a set of database read and write operations where all or none of the operations are successful.
Single-document operations have always been atomic in MongoDB. MongoDB added support for multi-document ACID transactions in version 4.0, and expanded that support to include distributed transactions in version 4.2.
The guarantees provided by MongoDB ensure complete isolation while a document is updated; any error causes the operation to be rolled back, leaving the document unchanged.
With proper modeling, transactions that span multiple records are not always necessary. Data in MongoDB, as we saw earlier, can be related and modeled in a single data structure using a variety of types, including sub-documents and arrays. By doing this, users get the same data integrity guarantees as those provided by relational databases.
If you’re used to having to take down your site or application to change the data structure, you’re in luck: MongoDB is designed for change.
Unlike SQL databases, where you must determine and declare a table’s schema before inserting data, MongoDB collections, by default, do not require your documents to have the same schema.
This means no downtime to change schemas: you can start writing new data with different structures at any time without interrupting your operations.
This flexibility makes it easy to map documents to an entity or object. Each document can be mapped with the fields of an Object, even if the document has substantial variation from other documents in the collection.
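As a rough illustration of that mapping, here is a minimal JavaScript sketch. The `Student` class and its `fromDocument` helper are hypothetical names introduced for this example, not part of any MongoDB driver, and the documents are plain objects shaped like the student records used earlier.

```javascript
// Hypothetical application-side entity for the students collection.
class Student {
  constructor(name, group, shift = null) {
    this.name = name;
    this.group = group;
    this.shift = shift; // optional: older documents may not have it
  }

  // Build a Student from a document, tolerating missing fields.
  static fromDocument(doc) {
    return new Student(doc.name, doc.group, doc.shift ?? null);
  }
}

// Documents with different shapes, as they might come back from a query.
const docs = [
  { name: "Gustavo", group: "9A" },
  { name: "Tomas", group: "9C", shift: "morning" },
];

const students = docs.map((doc) => Student.fromDocument(doc));
console.log(students);
```

Giving missing fields an explicit default in one place keeps the rest of the application from having to check for them.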
Like any software, MongoDB also has its drawbacks. Most of them are actually limitations that can be fixed or improved in the future, but at the moment (2022), some of the points I’m going to mention below may become a problem for those who are thinking about using MongoDB.
The maximum size of the BSON document is 16 megabytes.
The maximum document size helps ensure that a single document cannot use an excessive amount of RAM or, during transmission, an excessive amount of bandwidth. To store documents larger than the maximum size, MongoDB provides the GridFS API.
MongoDB supports no more than 100 levels of nesting for BSON documents. Each object or array adds one level.
To exemplify this limitation:
{
"_id": 1,
"Universe": {
"Virgo Cluster": [
{
"Virgo Supercluster": {
"Local Group": {
"Milky Way": {
"Solar System": [
{
"Earth": [
{ "Asia": ["Countries", "..."], "Europe": ["Countries", ".."], "America": ["Countries", "..."] }
]
}
]
}
}
}
},
{ "Laniakea Supercluster": "..." }
]
}
}
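If you want to check how deep a document is before saving it, a small recursive function is enough. The sketch below is plain JavaScript and assumes JSON-like values only (objects, arrays and primitives); `nestingDepth` and `MAX_NESTING` are names made up for this example, and counting the top-level document as level 1 is also an assumption about how the limit is applied.

```javascript
// Limit quoted above: no more than 100 levels of nesting.
const MAX_NESTING = 100;

// Hypothetical helper: each object or array adds one level.
function nestingDepth(value) {
  if (value === null || typeof value !== "object") {
    return 0; // primitives add no level
  }
  const children = Array.isArray(value) ? value : Object.values(value);
  let deepest = 0;
  for (const child of children) {
    deepest = Math.max(deepest, nestingDepth(child));
  }
  return 1 + deepest;
}

const sample = { _id: 1, a: { b: [1, { c: 2 }] } };
const depth = nestingDepth(sample);
console.log(depth, depth <= MAX_NESTING);
```

For the sample document the levels are the root object, `a`, the array `b` and the object inside that array.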
If MongoDB cannot use an index or indexes to obtain the sort order, it must perform a blocking sort operation on the data.
The name refers to a sort in which MongoDB must read and process all the input documents before it can return the first result, so the output of that particular query is held back until the sort completes.
If the sort operation requires more than 100 megabytes of system memory, MongoDB returns an error unless the query specifies cursor.allowDiskUse() (new in MongoDB 4.4). allowDiskUse() allows MongoDB to use temporary disk files to store data that exceeds the 100 megabyte memory limit while processing a sort operation.
db.COLLECTION_NAME.find().sort({KEY:1})
SQL Terms/Concepts | MongoDB Terms/Concepts |
---|---|
database | database |
table | collection |
row | document or BSON document |
column | field |
index | index |
primary key: any unique column or combination of columns can be specified as the primary key. | primary key: in MongoDB, the primary key is automatically set to the _id field. |