MongoDB basics and CRUD

These are the notes I took while reading the book MongoDB: The Definitive Guide. If you want to learn MongoDB, there are several free online courses on MongoDB University that go from the very basics to advanced concepts of database design.


A document is an ordered set of key-value pairs.


Group documents of related types together in the same collection, even though MongoDB doesn’t enforce it.



A database has its own permissions, and each database is stored in separate files on disk. A good rule of thumb is to store all data for a single application in the same database. Separate databases are useful when storing data for several applications or users on the same MongoDB server.

CRUD Operations

  1. Create

    The insert() operation can be used to insert a document into a collection.

  2. Read

    find() and findOne() can be used to read several documents and a single document, respectively.

  3. Update


  4. Delete

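As a minimal sketch of the create and read operations listed above, the function below takes the shell's `db` handle and runs them against a hypothetical `people` collection. The function name `createAndRead`, the collection name, and the sample document are assumptions made for illustration. `insertOne()` is used here as the single-document form of `insert()` that newer shells prefer.

```javascript
// Hypothetical helper: `db` is the database handle that the MongoDB shell provides.
function createAndRead(db) {
  // Create: add one document to the (assumed) "people" collection.
  db.people.insertOne({ name: "Ada", languages: ["JavaScript", "Python"] });

  // Read: find() returns every matching document (as a cursor in the shell)...
  const allMatches = db.people.find({ name: "Ada" });

  // ...while findOne() returns a single document, or null if nothing matches.
  const firstMatch = db.people.findOne({ name: "Ada" });

  return { allMatches, firstMatch };
}
```

Inside the shell this could be invoked as `createAndRead(db)`.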

Data Types

  1. Date

    • Use new Date() to create a date object. Calling Date() without new returns a string instead. See the ECMAScript specification for how JavaScript’s Date class works.
    • Dates in the database are stored as milliseconds since the epoch; they have no time zone information associated with them.
    • Time zone information can be stored as the value of another key.
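The points about dates can be checked with a few lines of plain JavaScript. This is only a sketch: the `meeting` document, its field names, and the time zone value are made up for illustration.

```javascript
// Date() without `new` returns a human-readable string, not a Date object.
const dateString = Date();

// new Date() returns a Date object, which is what should be stored.
const dateObject = new Date();

// A Date is just milliseconds since the epoch (UTC); it carries no time zone.
const meetingTime = new Date(Date.UTC(2024, 0, 15, 9, 30));

// Hypothetical document: the time zone lives under its own key ("timeZone").
const meeting = {
  title: "Planning",
  startsAt: meetingTime,
  timeZone: "Europe/Berlin",
};

console.log(typeof dateString);          // "string"
console.log(dateObject instanceof Date); // true
console.log(meeting.startsAt.getTime()); // milliseconds since the epoch
```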
  2. Arrays

    • [] in JSON represents an array.

    {"cars": ["Tesla","Ford","Lamborghini"]}
    {"things": ["pie",3.14,true,1,null]}

    • Arrays can hold values of different data types; the example above mixes strings, numbers, a boolean, and null.

    One of the great things about arrays in documents is that MongoDB “understands” their structure and knows how to reach inside of arrays to perform operations on their contents. This allows us to query on arrays and build indexes using their contents. For instance, in the previous example, MongoDB can query for all documents where 3.14 is an element of the “things” array. If this is a common query, you can even create an index on the “things” key to improve the query’s speed.
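In the shell, the query described above would look like `db.<collection>.find({"things": 3.14})`, and the index could be created with `db.<collection>.createIndex({"things": 1})`. The plain JavaScript sketch below only imitates the matching rule for array fields (a document matches when any element of the array equals the value). It is not how the server implements the query, and `matchesArrayValue` is a hypothetical helper name.

```javascript
// Sample documents taken from the notes above.
const docs = [
  { cars: ["Tesla", "Ford", "Lamborghini"] },
  { things: ["pie", 3.14, true, 1, null] },
];

// Hypothetical helper that imitates the shell query {"things": 3.14}:
// a document matches when the array under `key` contains `value`.
function matchesArrayValue(doc, key, value) {
  return Array.isArray(doc[key]) && doc[key].includes(value);
}

const hits = docs.filter((doc) => matchesArrayValue(doc, "things", 3.14));
console.log(hits.length); // 1
```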

  3. Embedded Documents

    • A document can be used as the value for a key. This is called an embedded document.
    • Embedded documents can be used to organize data in a more natural way.
    • Embedded documents are not JS objects; they support all the data types that a top-level document supports, as discussed above.
    • As with arrays, MongoDB “understands” the structure of embedded documents and is able to reach inside them to build indexes, perform queries, or make updates.

    In a relational database, a person with an address would probably be modeled as two separate rows in two different tables (one for “people” and one for “addresses”). With MongoDB we can embed the address document directly within the person document. When used properly, embedded documents can provide a more natural representation of information.

    • The flip side to this is that there will be more data repetition with MongoDB.
    • Suppose “addresses” were a separate table in a relational database and we needed to fix a typo in an address. When we did a join with “people” and “addresses,” we’d get the updated address for everyone who shares it. With MongoDB, we’d need to fix the typo in each person’s document.
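The trade-off above can be sketched in plain JavaScript. The names, the address, and the helper `fixStreet` are all made up for illustration. In the shell, the same fix would be an update that addresses the embedded field with the dot-notation key `"address.street"`.

```javascript
// Two hypothetical person documents that share one address, typo included.
const people = [
  { name: "John Doe", address: { street: "123 Prak Street", city: "Anytown", state: "NY" } },
  { name: "Jane Doe", address: { street: "123 Prak Street", city: "Anytown", state: "NY" } },
];

// Embedded fields are reached with nested property access here; the shell
// uses the dot-notation key "address.street" for the same field.
function fixStreet(documents, wrong, right) {
  let changed = 0;
  for (const doc of documents) {
    if (doc.address && doc.address.street === wrong) {
      doc.address.street = right; // each copy has to be fixed separately
      changed += 1;
    }
  }
  return changed;
}

const updatedCount = fixStreet(people, "123 Prak Street", "123 Park Street");
console.log(updatedCount); // 2
```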
    1. _id and ObjectIds

      • Every document stored in MongoDB must have an _id key. The _id key’s value can be of any type, but it defaults to an ObjectId.
      • Every document must have a unique _id within a collection.

      ObjectId is the default type for “_id”. The ObjectId class is designed to be lightweight, while still being easy to generate in a globally unique way across different machines. MongoDB’s distributed nature is the main reason why it uses ObjectIds as opposed to something more traditional, like an auto-incrementing primary key: it is difficult and time-consuming to synchronize auto-incrementing primary keys across multiple servers. Because MongoDB was designed to be a distributed database, it was important to be able to generate unique identifiers in a sharded environment.

      ObjectIds use 12 bytes of storage, which gives them a string representation of 24 hexadecimal digits: 2 digits for each byte.

      • The first four bytes of an ObjectId are a timestamp in seconds since the epoch. Because the timestamp comes first, ObjectIds will sort in roughly insertion order. This is not a strong guarantee but does have some nice properties, such as making ObjectIds efficient to index.
      • The next three bytes of an ObjectId are a unique identifier of the machine on which it was generated. This is usually a hash of the machine’s hostname. Including these bytes makes it very unlikely that different machines will generate colliding ObjectIds.
      • To provide uniqueness among different processes generating ObjectIds concurrently on a single machine, the next two bytes are taken from the process identifier (PID) of the ObjectId-generating process.
      • These first nine bytes of an ObjectId guarantee its uniqueness across machines and processes for a single second. The last three bytes are simply an incrementing counter that is responsible for uniqueness within a second in a single process. This allows for up to 256^3 (16,777,216) unique ObjectIds to be generated per process in a single second. Newer MongoDB versions replace the machine and process bytes with a 5-byte random value generated once per process; the timestamp and counter parts are unchanged.
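The byte layout described above can be illustrated by slicing the 24-digit hex string. This is a sketch that follows the 4 + 3 + 2 + 3 layout from these notes; the helper name `splitObjectId` and the sample id are assumptions. On newer servers the five middle bytes are a random value, so the `machine` and `pid` fields would not carry that meaning there.

```javascript
// Hypothetical helper: split a 24-digit hex ObjectId string into the four
// fields described above (4 + 3 + 2 + 3 bytes).
function splitObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal digits");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // bytes 0-3: timestamp
  return {
    createdAt: new Date(seconds * 1000),         // seconds -> milliseconds
    machine: hex.slice(8, 14),                   // bytes 4-6: machine identifier
    pid: parseInt(hex.slice(14, 18), 16),        // bytes 7-8: process identifier
    counter: parseInt(hex.slice(18, 24), 16),    // bytes 9-11: counter
  };
}

const parts = splitObjectId("507f1f77bcf86cd799439011");
console.log(parts.createdAt.toISOString());
console.log(256 ** 3); // 16777216 possible counter values
```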

    Auto-generation of _id

    As stated previously, if there is no “_id” key present when a document is inserted, one will be automatically added to the inserted document. This can be handled by the MongoDB server but will generally be done by the driver on the client side. The decision to generate them on the client side reflects an overall philosophy of MongoDB: work should be pushed out of the server and to the drivers whenever possible. This philosophy reflects the fact that, even with scalable databases like MongoDB, it is easier to scale out at the application layer than at the database layer. Moving work to the client side reduces the load that would otherwise require the database to scale.
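The idea of adding the “_id” on the client side, before the document is sent to the server, can be sketched as follows. This is not a real driver's implementation: `makeId` and `withId` are hypothetical names, and the five middle bytes are random placeholders standing in for the machine and process identifiers.

```javascript
let idCounter = 0; // incremented for every id generated in this process

// Placeholder for the machine and process bytes, chosen once per process.
let processBytes = "";
for (let i = 0; i < 5; i++) {
  processBytes += Math.floor(Math.random() * 256).toString(16).padStart(2, "0");
}

// Hypothetical stand-in for a driver's id generator: 4 bytes of timestamp,
// 5 placeholder bytes, and a 3-byte counter, written as 24 hex digits.
function makeId() {
  const seconds = Math.floor(Date.now() / 1000);
  const timestamp = seconds.toString(16).padStart(8, "0");
  idCounter = (idCounter + 1) % 0x1000000;
  return timestamp + processBytes + idCounter.toString(16).padStart(6, "0");
}

// Add an _id only when the document does not already have one.
function withId(doc) {
  if (Object.prototype.hasOwnProperty.call(doc, "_id")) return doc;
  return { _id: makeId(), ...doc };
}

console.log(withId({ name: "Ada" }));
```

A document that already has an “_id”, for example `withId({ _id: 7, name: "Bob" })`, is returned unchanged.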

Using the Shell

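```
// list the available shell commands
help
// list the databases on the server
show dbs
// list the collections in the current database
show collections
// list the users of the current database
show users
```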
