MongoDB¶
The seminar's goal is to understand the concepts of the MongoDB document database and the usage of the MongoDB C#/.NET Driver.
Pre-requisites¶
Required tools to complete the tasks:
- Microsoft Visual Studio 2019 (not VS Code)
- MongoDB Community Edition
- Robo 3T
- Database initialization script: mongo.js
- Starter code: https://github.com/bmeviauac01/gyakorlat-mongo-kiindulo
Recommended to review:
- C# language and Linq queries
- MongoDB lecture
- Using MongoDB guide
How to work during the seminar¶
The exercises are solved together with the instructor. A few exercises we can try to solve by ourselves and then discuss the results. The final exercise is individual work if time permits.
This guide summarizes and explains the behavior. Before looking at these provided answers, we should think first!
Exercise 0: Create database, open starter code¶
-
Create a new working directory, e.g.,
c:\work\NEPTUN
.Working directory
This folder will be our working directory, that is, everything is created within this folder.
-
Start the MongoDB server.
- Create a new folder for the database files, e.g., use name
db
. (Create this within the working directory.) -
Open a command prompt and start MongoDB server:
mongod.exe --dbpath="c:\work\<NEPTUN>\db"
Command prompt can be located in the start menu by typing "cmd".
In the university computer laboratories, MongoDB is in folder
c:\tools\mongodb\bin
. Navigate to this directory using thecd
command. -
Keep this command prompt open because it hosts our server. To stop the server at the end of the seminar, press Ctrl+C.
- Create a new folder for the database files, e.g., use name
-
Create the database.
- Download the database initialization script and save it as
mongo.js
in the working directory. -
Open a new command prompt and initialize database:
mongo.exe localhost:27017/datadriven c:\work\<NEPTUN>\mongo.js
The executable is not mongod but simply mongo. This is a client software to connect to the server and execute queries.
In the university computer laboratories, this executable is located at
c:\tools\mongodb\bin
.Let us note the connection string that contains the name of the database too!
- Download the database initialization script and save it as
-
Check the database using Robo3T.
-
Start Robo3T (in the university computer laboratories, it is located at
c:\tools\robo3t
) and connect to the server. -
Check if the Collections were created.
-
-
Download the starter code!
- Open yet another new command prompt into our working directory.
- Execute the following command:
git clone --depth 1 https://github.com/bmeviauac01/gyakorlat-mongo-kiindulo.git
-
Open the sln file from the newly created folder using Visual Studio.
-
Let us examine this project.
- This is a .NET Core console application. The structure resembles the structure of the Entity Framework project seen before: directory
Entities
contains the database entities while our code will be written toProgram.cs
. Program.cs
already contains the initialization of the connection to MongoDB.- Interface
IMongoClient
is used for all communication with the database. We will not use this directly. - Interface
IMongoDatabase
represents the databasedatadriven
within the MongoDB server. - And the
IMongoCollection<TEntity>
interfaces represent the specific collections we can use to execute queries and modification commands.
- Interface
- The database documents are mapped to the C# entity classes in folder
Entities
. A major difference compared to the behavior previously seen in Entity Framework is that these classes were not generated by written manually.- Most entities are already mapped.
- We will create one more class during an exercise.
- This is a .NET Core console application. The structure resembles the structure of the Entity Framework project seen before: directory
Exercise 1: Queries¶
Write C# code using the MongoDB C#/.NET Driver in the following exercises. Print the results to the console.
-
List the names and the amount of stock of all products that we have more than 30 in stock!
-
List the orders that consist of at least two items!
-
List the orders that have a total value of at least 30.000! For each order, print the customer name, and list all items of the order (with the product name, amount, and price).
-
Find the most expensive product!
-
List the products that have been ordered at least twice!
Solution
-
We need only the product collection and execute a simple query. The filter criteria can be written as a Lambda-expression and with the builder syntax too.
Console.WriteLine("***** Exercise one *****"); // 1.1 first solution Console.WriteLine("\t1.1 First solution:"); var qProductAndStock1 = productsCollection .Find(p => p.Stock > 30) .ToList(); foreach (var p in qProductAndStock1) Console.WriteLine($"\t\tName={p.Name}\tStock={p.Stock}"); // 1.1 second solution Console.WriteLine("\t1.1 Second solution:"); var qProductAndStock2 = productsCollection .Find(Builders<Product>.Filter.Gt(p => p.Stock, 30)) .ToList(); foreach (var p in qProductAndStock2) Console.WriteLine($"\t\tName={p.Name}\tStock={p.Stock}");
-
This is similar to the previous one. We may note that we would have needed a join in a relational database, but we have everything at hand here.
// 1.2 first solution Console.WriteLine("\t1.2 First solution:"); var qOrderItems1 = ordersCollection .Find(o => o.OrderItems.Length >= 2) .ToList(); foreach (var o in qOrderItems1) Console.WriteLine($"\t\tCustomerID={o.CustomerID}\tOrderID={o.ID}\tItems={o.OrderItems.Length}"); // 1.2 second solution Console.WriteLine("\t1.2 Second solution:"); var qOrderItems2 = ordersCollection .Find(Builders<Order>.Filter.SizeGte(o => o.OrderItems, 2)) .ToList(); foreach (var o in qOrderItems2) Console.WriteLine($"\t\tCustomerID={o.CustomerID}\tOrderID={o.ID}\tItems={o.OrderItems.Length}");
-
A simple query is not sufficient for this exercise; thus, we need the aggregation pipeline. We may still note that every information we need is still available in one collection.
// 1.3 Console.WriteLine("\t1.3:"); var qOrderTotal = ordersCollection .Aggregate() .Project(order => new { CustomerID = order.CustomerID, OrderItems = order.OrderItems, Total = order.OrderItems.Sum(oi => oi.Amount * oi.Price) }) .Match(order => order.Total > 30000) .ToList(); foreach (var o in qOrderTotal) { Console.WriteLine($"\t\tCustomerID={o.CustomerID}"); foreach (var oi in o.OrderItems) Console.WriteLine($"\t\t\tProductID={oi.ProductID}\tPrice={oi.Price}\tAmount={oi.Amount}"); }
-
To find the most expensive product, we need two queries: first, find the largest price value, then find the products with this price.
// 1.4 Console.WriteLine("\t1.4:"); var maxPrice = productsCollection .Find(_ => true) .SortByDescending(p => p.Price) .Limit(1) .Project(p => p.Price) .Single(); var qProductMax = productsCollection .Find(p => p.Price == maxPrice) .ToList(); foreach (var t in qProductMax) Console.WriteLine($"\t\tName={t.Name}\tPrice={t.Price}");
-
This exercise is complicated with our current database scheme because we do not have everything at hand within one collection. We need the product information from one collection, and the order details from another one.
We will be doing a "join" in the client-side, that is, in C# code. The solution's outline is to query the orders, then in C# gather the orders by product, and finally, query the product details.
// 1.5 Console.WriteLine("\t1.5:"); var qOrders = ordersCollection .Find(_ => true) .ToList(); var productOrders = qOrders .SelectMany(o => o.OrderItems) // All order items into one list .GroupBy(oi => oi.ProductID) .Where(p => p.Count() >= 2); var qProducts = productsCollection .Find(_ => true) .ToList(); var productLookup = qProducts.ToDictionary(p => p.ID); foreach (var p in productOrders) { var product = productLookup.GetValueOrDefault(p.Key); Console.WriteLine($"\t\tName={product?.Name}\tStock={product?.Stock}\tOrders={p.Count()}"); }
This solution is very elegant and works only for small databases. Suppose we face a similar task under real-life circumstances. We have two choices: denormalize the database scheme and copy product details into the orders, or create an aggregation pipeline executed by the server that does something similar to the code above (MongoDB can do that, but it will not be very fast).
Exercise 2: Create a new entity class¶
-
Examine the classes
Product
andVAT
. Why is there a field with a[BsonId]
attribute in classProduct
and not in classVAT
? -
Create a new entity class for mapping
Category
document, then add and initialize aIMongoCollection<Category>
field next to the others.
Solution
-
Class
Product
represents theproducts
collection; therefore each item has a uniqueObjectID
. On the other hand, classVAT
is an embedded field used byProduct
and has no collection on its own; hence it needs no id. -
Create our new
Category
POCO class.Let us check a few sample documents using Robo3T in
categories
collection.Create a new class
Category
in folderEntities
with matching fields as below.using MongoDB.Bson; using MongoDB.Bson.Serialization.Attributes; namespace BME.DataDriven.Mongo.Entitites { public class Category { [BsonId] public ObjectId ID { get; set; } public string Name { get; set; } public ObjectId? ParentCategoryID { get; set; } } }
Add a new collection interface field in
Program.cs
as follows.private static IMongoCollection<Category> categoriesCollection;
And assign the value in the
initialize
method to get the collection.categoriesCollection = database.GetCollection<Category>("categories");
Exercise 3: Data modification¶
The collection classes IMongoColection<TEntity>
can also be used to execute modification operations.
-
Write C# code that increases the price of all products in category "LEGO" by 10 percent!
-
Create a new category named Expensive toys and move all products here that cost more than 8000!
-
Delete all categories that contain no products.
Solution
-
Find the ID of the category then update all products that have this category id.
Console.WriteLine("***** Exercise three *****"); //3.1 Console.WriteLine("\t3.1:"); var categoryLegoId = categoriesCollection .Find(c => c.Name == "LEGO") .Project(c => c.ID) .Single(); var qProductLego = productsCollection .Find(p => p.CategoryID == categoryLegoId) .ToList(); Console.WriteLine("\t\tBefore modification:"); foreach (var p in qProductLego) Console.WriteLine($"\t\t\tName={p.Name}\tStock={p.Stock}\tÁr={p.Price}"); productsCollection.UpdateMany( filter: p => p.CategoryID == categoryLegoId, update: Builders<Product>.Update.Mul(p => p.Price, 1.1)); qProductLego = productsCollection .Find(p => p.CategoryID == categoryLegoId) .ToList(); Console.WriteLine("\t\tAfter modification:"); foreach (var p in qProductLego) Console.WriteLine($"\t\t\tName={p.Name}\tStock={p.Stock}\tÁr={p.Price}");
-
MongoDB can execute the following sequence of steps in a single atomic step: "Get me category
Expensive toys
. If it does not exist, create it." We will useFindOneAndUpdate
to achieve this.//3.2 Console.WriteLine("\t3.2:"); var catExpensiveToys = categoriesCollection.FindOneAndUpdate<Category>( filter: c => c.Name == "Expensive toys", update: Builders<Category>.Update.SetOnInsert(c => c.Name, "Expensive toys"), options: new FindOneAndUpdateOptions<Category, Category> { IsUpsert = true, ReturnDocument = ReturnDocument.After }); productsCollection.UpdateMany( filter: p => p.Price > 8000, update: Builders<Product>.Update.Set(p => p.CategoryID, catExpensiveToys.ID)); var qProdExpensive = productsCollection .Find(p => p.CategoryID == catExpensiveToys.ID) .ToList(); foreach (var p in qProdExpensive) Console.WriteLine($"\t\tName={p.Name}\tPrice={p.Price}");
-
Query categories that contain any product, then delete the ones that do not belong among this list.
//3.3 Console.WriteLine("\t3.3:"); Console.WriteLine($"\t\tBefore modification: {categoriesCollection.CountDocuments(_ => true)} categories"); var qProductCategory = new HashSet<ObjectId>( productsCollection .Find(_ => true) .Project(p => p.CategoryID) .ToList()); categoriesCollection.DeleteMany(c => !qProductCategory.Contains(c.ID)); Console.WriteLine($"\t\tAfter modification: {categoriesCollection.CountDocuments(_ => true)} categories");
Let us note that this is not an atomic operation. If a product was added concurrently, we could have deleted its category.