MongoDB Developer Certification: Key Exam Topics

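Posted 2022-08-08, updated 2023-04-18. Category: Software Development, MongoDB

This post covers the key topics for the MongoDB Developer certification exam, translated and compiled from the official Study Guide. I use it as a quick pre-exam review, so that I at least know what each item is about.

The recommended way to study is to take the free official online courses at MongoDB University. The schedule is flexible, and the hands-on labs and quizzes help the material stick.

Certification: MongoDB Developer Certification

Courses for the developer certification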

M001 MongoDB Basics
M103 Basic Cluster Administration
M121 Aggregation Framework
M220 MongoDB for Developers
M201 MongoDB Performance
M320 MongoDB Data Modeling

CRUD

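CRUD syntax and how to use it
Data types supported by MongoDB
The exam does not test rote memorization of syntax, but you need to recognize correct syntax and know which parameters are required or optional.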

Create

insert(insert, insertOne, insertMany), update, upsert, save
Bulk insert(ordered, unordered)
How _id affects CRUD operations
Creating and using ObjectId
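
The difference between ordered and unordered bulk inserts can be sketched with a small pure-Python simulation. It does not talk to MongoDB: `simulate_insert_many` and the in-memory `collection` dict are hypothetical stand-ins, and the only rule modeled is the unique `_id` constraint. With an ordered insert, processing stops at the first error; with an unordered insert, the remaining documents are still attempted.

```python
def simulate_insert_many(collection, docs, ordered=True):
    """Simulate insertMany against a dict keyed by _id (not a real driver call).

    Returns (inserted_ids, errors), where errors holds the positions of
    documents rejected for a duplicate _id.
    """
    inserted_ids = []
    errors = []
    for position, doc in enumerate(docs):
        if doc["_id"] in collection:
            errors.append(position)
            if ordered:
                # Ordered: stop at the first failure and skip the rest.
                break
            # Unordered: record the failure and keep going.
            continue
        collection[doc["_id"]] = doc
        inserted_ids.append(doc["_id"])
    return inserted_ids, errors


docs = [{"_id": 1}, {"_id": 1}, {"_id": 2}]

ordered_ids, ordered_errors = simulate_insert_many({}, docs, ordered=True)
unordered_ids, unordered_errors = simulate_insert_many({}, docs, ordered=False)

print(ordered_ids, ordered_errors)      # [1] [1]
print(unordered_ids, unordered_errors)  # [1, 2] [1]
```

In the ordered run the document with `_id: 2` is never attempted, which is the behavior worth remembering for the exam.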

Read

find(query, projection, options, sort)
interpret regular expression
array query(VERY important)
findAndModify(aggregation, project, embedded)
find() returns a cursor
findOne() returns a single document
cursor methods

Update

Understand the update parameters, the update flow, and the save command
findAndModify, findOneAndUpdate
Update operators:
- set, unset
- setOnInsert
- inc, mul, min, max
- rename
- Array:
  - pop, addToSet
  - pull, pullAll
  - push
  - each, slice, sort, position
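
To make the semantics of a few of these operators concrete, here is a minimal pure-Python sketch. `apply_update` is a hypothetical helper, not a driver or server function; it handles only `$set`, `$unset`, `$inc`, `$push`, and `$addToSet` on top-level fields, and the sample document is made up for illustration.

```python
import copy


def apply_update(doc, update):
    """Apply a small subset of update operators to a copy of doc.

    Only top-level fields are handled; this illustrates the operator
    semantics and is not how the server implements them.
    """
    result = copy.deepcopy(doc)
    for field, value in update.get("$set", {}).items():
        result[field] = value
    for field in update.get("$unset", {}):
        result.pop(field, None)
    for field, amount in update.get("$inc", {}).items():
        # $inc creates the field if it is missing.
        result[field] = result.get(field, 0) + amount
    for field, value in update.get("$push", {}).items():
        # $push always appends, even if the value is already present.
        result.setdefault(field, []).append(value)
    for field, value in update.get("$addToSet", {}).items():
        # $addToSet appends only when the value is not already in the array.
        values = result.setdefault(field, [])
        if value not in values:
            values.append(value)
    return result


doc = {"_id": 1, "qty": 5, "tags": ["a"], "labels": ["x"], "tmp": True}
updated = apply_update(doc, {
    "$set": {"status": "A"},
    "$unset": {"tmp": ""},
    "$inc": {"qty": 2},
    "$push": {"tags": "a"},
    "$addToSet": {"labels": "x"},
})
print(updated)
```

Note the contrast between the two array operators: `tags` ends up with a duplicate `"a"`, while `labels` is unchanged because `"x"` was already present.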

Delete

drop collection
remove, deleteOne, deleteMany, findOneAndDelete
How to use TTL

Index

Default _id index
Get, Create, Drop index(es)
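What is a collection scan? What causes it? How do you avoid it?

Index concepts (VERY IMPORTANT)

How to use and interpret explain
How to use an index for sorting
What covered queries are
What the ESR (Equality, Sort, Range) rule is
How indexes affect write performance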
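
As a quick illustration of ESR, the sketch below orders the keys of a compound index as Equality fields first, then Sort fields, then Range fields. `esr_index_keys` and the field names `status`, `created`, and `qty` are hypothetical; the function only arranges key names and does not create an index.

```python
def esr_index_keys(equality, sort, range_fields):
    """Order compound index keys by the ESR rule: Equality, Sort, Range.

    equality and range_fields are lists of field names; sort is a list of
    (field, direction) pairs, with direction 1 (ascending) or -1 (descending).
    """
    keys = [(field, 1) for field in equality]
    keys += [(field, direction) for field, direction in sort]
    # Skip range fields that are already placed earlier in the key list.
    placed = {field for field, _ in keys}
    keys += [(field, 1) for field in range_fields if field not in placed]
    return keys


# Hypothetical query: status == "A", qty > 10, sorted by created descending.
keys = esr_index_keys(
    equality=["status"],
    sort=[("created", -1)],
    range_fields=["qty"],
)
print(keys)  # [('status', 1), ('created', -1), ('qty', 1)]
```

The resulting list of `(field, direction)` pairs corresponds to an index specification of `{status: 1, created: -1, qty: 1}`.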

Index Types

Understand the following index types:

Single index
Compound index
- How field order matters
Multikey index
- What its limitations are
Geospatial index
- How to create 2d and 2dsphere indexes
- GeoJSON
  - within a circle
  - Near a point
  - within a polygon
Text index
- How to create and use one
- How to sort the results
Hashed index
- hashed index
- compound hashed index
Wildcard index

Index properties

Unique index
TTL index
Hidden index
Partial index (vs Sparse index)
Hybrid index
Using regex on strings and with indexes

Data Modeling

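Given two data models, choose the more efficient one
Know the common schema design patterns
Understand the benefits of MongoDB's special data types
Understand the difference between embedding and linking data

Introduction

Know the definition of the working set
Why the size of the working set matters for read and write performance
Different ways to model data
- GridFS
- Read-only views
- Collations
- Special-case data types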

Document structure

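The difference between embedding and linking data
What normalization and denormalization mean

Relational vs. MongoDB schemas

Differences between relational databases and MongoDB

one-to-one

Definition and use cases
Pros and cons of embedding vs. linking in a one-to-one relationship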

one-to-many

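What the options are for one-to-many
Common one-to-many models
Pros and cons of each

Many-to-many

How to model a many-to-many relationship

Modeling tree structure

Common tree models
Pros and cons of each, plus read and write considerations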

Schema design patterns

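How to model keyword search
How to model monetary data

BLOB options

GridFS is a structure that can store BLOBs and still be queried
The approximate size of a single GridFS document
How to store and work with data in GridFS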

Views

What is a view? How do you create one?
How views use indexes
What you can do with a view

Collation and Case Sensitive Indexes

How to query with a collation in the mongo shell
Index collation
Collection collation
Which collation takes precedence
What the locale field is
Which values of strength use diacritics (2+)
The default strength of a collation (3) and what it uses
Which collation queries can and cannot use an index
How to create and use a case sensitive index

NumberDecimal

Why NumberDecimal is needed
NumberDecimal vs NumberLong vs double
How to insert, store, and get values
How its precision for decimal numbers compares with the rounding behavior of double
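
The precision point can be seen without a database. In the sketch below, Python's `float` is an IEEE 754 double, the same representation as a BSON double, while `decimal.Decimal` is used only as an analogy for NumberDecimal (Decimal128); the two decimal types are not identical, so treat this as an illustration of the idea rather than of MongoDB's exact behavior.

```python
from decimal import Decimal

# Binary floating point (like a BSON double) cannot represent 0.1 exactly.
double_sum = 0.1 + 0.2
print(double_sum == 0.3)  # False

# A decimal type keeps the decimal digits exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))  # True
print(decimal_sum)                    # 0.3
```

The decimal values are built from strings on purpose: constructing them from a float would carry the binary rounding error along. The same advice applies to `NumberDecimal("0.1")` in the shell.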

Patterns

Polymorphic
Attribute
Bucket
Outlier
Computed
Subset
Extended reference
Approximation
Tree
Pre-allocation
Document versioning
Schema versioning
Anti-pattern: Massive arrays

Aggregation

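There are too many stages and operators to list here; study the official documentation directly

Know every stage and what it does (syntax for all stages)
Understand what happens, and how the pipeline behaves, when there are multiple $group and $unwind stages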

Operators

Which operators can be used in any stage
Which operators are specific to certain stages

Mechanics

Memory limits
How to optimize an aggregation
How aggregation works with indexes
All aggregation options and their effects

Replication

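What replication is, its benefits, and its tradeoffs in speed and capacity
What the oplog is and how it works, and what idempotence and statement-based replication mean
What happens when a node fails
The purpose of replication is high availability and durability of data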

Node

Arbiter, Delayed
votes, priority
Syntax for creating a replica set
Understand initial sync

Election

When an election takes place
How priority, votes, optime, and unreachable servers affect the outcome of the vote
What kind of node wins the election

Failover

What triggers a failover
A failover triggers an election

Rollback

What triggers a rollback
What happens to the data when a rollback is triggered? How is the rollback performed?

rs.status()

Understand its output thoroughly

Replica set reconfiguration

What is it for?
Add/Remove replica set members

Oplog

idempotent
Which operations are recorded in the oplog
How the oplog stores a document's _id on a write
How many oplog entries a write produces
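
The points above can be sketched with a simplified model of the oplog. The entry shape used here (`op`, `o2`, `o`) is a reduced, illustrative form rather than the exact format of any server version, and `to_oplog_entries` and `apply_entry` are hypothetical helpers. The sketch assumes one entry per affected document, identified by `_id`, with the increment recorded as the resulting value so that replaying an entry is harmless.

```python
def to_oplog_entries(collection, field, amount):
    """Turn one 'increment field on every document' write into oplog-style entries.

    Produces one entry per affected document, keyed by _id, and records the
    resulting value as a $set rather than the original $inc.
    """
    return [
        {"op": "u", "o2": {"_id": doc["_id"]},
         "o": {"$set": {field: doc[field] + amount}}}
        for doc in collection
    ]


def apply_entry(collection, entry):
    """Apply one simplified update entry to the matching document."""
    for doc in collection:
        if doc["_id"] == entry["o2"]["_id"]:
            doc.update(entry["o"]["$set"])


primary = [{"_id": 1, "views": 10}, {"_id": 2, "views": 20}]
secondary = [dict(doc) for doc in primary]

entries = to_oplog_entries(primary, "views", 1)
print(len(entries))  # 2

# Applying every entry twice gives the same result as applying it once.
for entry in entries + entries:
    apply_entry(secondary, entry)
print(secondary)  # [{'_id': 1, 'views': 11}, {'_id': 2, 'views': 21}]
```

Had the entries stored the original `$inc`, the second application would have pushed the counters to 12 and 22, which is why idempotent entries matter for replay.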

Read preference

Which nodes may be queried under each setting
When stale data may be read

Write concern

What the default setting is
How to set the write concern to majority or to a fixed number of nodes
How many nodes must hold a copy of the data under each of these write concerns
How to ensure writes get to the journal before the acknowledgment
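
As a rough aid for the "how many nodes" question, the sketch below computes a simple majority of voting members. `majority` is a hypothetical helper, and it deliberately ignores details such as arbiters and non-voting members, so treat the numbers as the basic rule only. The two dictionaries show the general shape of write concern documents: `w` selects majority or a fixed number of nodes, `j: True` asks for the write to reach the journal before acknowledgment, and `wtimeout` is a time limit in milliseconds.

```python
def majority(voting_members):
    """Smallest number of members that is more than half of voting_members."""
    return voting_members // 2 + 1


for n in (3, 4, 5, 7):
    print(n, majority(n))
# 3 2
# 4 3
# 5 3
# 7 4

# Write concern documents as they would be passed to a write operation.
w_majority = {"w": "majority", "j": True, "wtimeout": 5000}
w_two_nodes = {"w": 2}
```

Note that a four-member set needs three acknowledgments, the same as a five-member set, which is one reason odd member counts are usually preferred.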

Sharding

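How to set up sharding
Horizontal scaling
How to choose the right shard key, and the cost of choosing the wrong one
Understand the role of the load balancer
Understand the responsibilities of the config server
Sharding provides horizontal scaling and increases read and write capacity
Replication improves data durability and high availability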

Shard key

Starting in 4.2, a document's shard key value is mutable, unless the shard key field is _id
What makes a good or bad shard key
How a range-based shard key works

Chunk and Balancer

How the shard key defines chunk ranges
How to tell which chunk range a document belongs to
When chunks are split automatically
How the balancer uses chunks to keep the cluster balanced
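
Finding the chunk that owns a document comes down to a range lookup on the shard key. The sketch below assumes a numeric shard key and two made-up split points; `split_points` and `chunk_for` are hypothetical, and the strings `"MinKey"` and `"MaxKey"` merely stand in for the BSON MinKey and MaxKey bounds. Each chunk includes its lower bound and excludes its upper bound.

```python
from bisect import bisect_right

# Hypothetical split points on a numeric shard key. They define the chunks
# [MinKey, 100), [100, 200), [200, MaxKey).
split_points = [100, 200]


def chunk_for(value):
    """Return the (lower, upper) bounds of the chunk that owns value.

    Lower bounds are inclusive and upper bounds are exclusive.
    """
    bounds = ["MinKey"] + split_points + ["MaxKey"]
    index = bisect_right(split_points, value)
    return bounds[index], bounds[index + 1]


print(chunk_for(42))   # ('MinKey', 100)
print(chunk_for(100))  # (100, 200)
print(chunk_for(250))  # (200, 'MaxKey')
```

The value 100 lands in the second chunk, not the first, because the upper bound of a chunk is exclusive.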

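Config server and Cluster metadata

What a config server is and what it is responsible for
The flow for retrieving data from the config servers
What happens if the config servers fail
What type of servers config servers are made of
What happens when a config server cannot be elected primary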

Server Tools

mongoimport, mongoexport vs mongodump, mongorestore
mongostat, mongotop, mongofiles, bsondump

Storage Engines

WiredTiger concurrency level
Compression algorithms available in WiredTiger
WiredTiger features:
- Lock/Concurrency (document level)
- Journaling
- Data compression
  - Default: snappy for all collections, and prefix compression for all indexes
  - Collection: zlib, zstd
What the data files look like
In-memory storage engine

Happy Father's Day, everyone!

Author: MingYi Chou
Copyright: Feel free to repost without asking, but please credit the source! This blog is licensed under BY-NC-SA.