4.1 Principles of schema design
some question for table when modeling data with any database system:
- What are your application access patterns?
- What’s the basic unit of data?
- What are the capabilities of your database?
What makes a good unique id or primary key for a record?
4.2 Designing an e-commerce data model
4.2.1 Schema basics
{
_id: ObjectId("4c4b1476238d3b4dd5003981"),
slug: "wheelbarrow-9092",
sku: "9092",
name: "Extra Large Wheelbarrow",
description: "Heavy duty wheelbarrow...",
details: {
weight: 47,
weight_units: "lbs",
model_num: 4039283402,
manufacturer: "Acme",
color: "Green"
},
total_reviews: 4,
average_review: 4.5,
pricing: {
retail: 589700,
sale: 489700,
},
price_history: [
{
retail: 529700,
sale: 429700,
start: new Date(2010, 4, 1),
end: new Date(2010, 4, 8)
},
{
retail: 529700,
sale: 529700,
start: new Date(2010, 4, 9),
end: new Date(2010, 4, 16)
},
],
primary_category: ObjectId("6a5b1476238d3b4dd5000048"),
category_ids: [
ObjectId("6a5b1476238d3b4dd5000048"),
ObjectId("6a5b1476238d3b4dd5000049")
],
main_cat_id: ObjectId("6a5b1476238d3b4dd5000048"),
tags: ["tools", "gardening", "soil"],
}
Unique slug
db.products.createIndex({slug: 1}, {unique: true})
nested documents
- one-to-many relationships
- many-to-many relationships
to query for all products in the Gardening Tools category, the code is simple:{
_id: ObjectId("6a5b1476238d3b4dd5000048"),
slug: "gardening-tools",
name: "Gardening Tools",
description: "Gardening gadgets galore!",
parent_id: ObjectId("55804822812cb336b78728f9"),
ancestors: [
{
name: "Home",
_id: ObjectId("558048f0812cb336b78728fa"),
slug: "home"
},
{
name: "Outdoors",
_id: ObjectId("55804822812cb336b78728f9"),
slug: "outdoors"
}
]
}
To query for all categories from a given product, you use the $in operator:db.products.find({category_ids: ObjectId('6a5b1476238d3b4dd5000048')})
product = db.products.findOne({"slug": "wheelbarrow-9092"})
db.categories.find({_id: {$in: product['category_ids']}})
4.2.2 Users and orders
Finding all orders for a given user:{
_id: ObjectId("6a5b1476238d3b4dd5000048"),
user_id: ObjectId("4c4b1476238d3b4dd5000001"),
state: "CART",
line_items: [
{
_id: ObjectId("4c4b1476238d3b4dd5003981"),
sku: "9092",
name: "Extra Large Wheelbarrow",
quantity: 1,
pricing: {
retail: 5897,
sale: 4897,
}
},
{
_id: ObjectId("4c4b1476238d3b4dd5003982"),
sku: "10027",
name: "Rubberized Work Glove, Black",
quantity: 2,
pricing: {
retail: 1499,
sale: 1299
}
}
],
shipping_address: {
street: "588 5th Street",
city: "Brooklyn",
state: "NY",
zip: 11215
},
sub_total: 6196
}
The query for getting the user for a particular order is equally simple:db.orders.find{user_id: user['_id']}
db.users.findOne({_id: order['user_id']})
{
_id: ObjectId("4c4b1476238d3b4dd5000001"),
username: "kbanker",
email: "kylebanker@gmail.com",
first_name: "Kyle",
last_name: "Banker",
hashed_password: "bd1cfa194c3a603e7186780824b04419",
addresses: [
{
name: "home",
street: "588 5th Street",
city: "Brooklyn",
state: "NY",
zip: 11215
},
{
name: "work",
street: "1 E. 23rd Street",
city: "New York",
state: "NY",
zip: 10010
}
],
payment_methods: [
{
name: "VISA",
payment_token: "43f6ba1dfda6b8106dc7"
}
]
}
{
_id: ObjectId("4c4b1476238d3b4dd5000041"),
product_id: ObjectId("4c4b1476238d3b4dd5003981"),
date: new Date(2010, 5, 7),
title: "Amazing",
text: "Has a squeaky wheel, but still a darn good wheelbarrow.",
rating: 4,
user_id: ObjectId("4c4b1476238d3b4dd5000042"),
username: "dgreenthumb",
helpful_votes: 3,
voter_ids: [
ObjectId("4c4b1476238d3b4dd5000033"),
ObjectId("7a4f0376238d3b4dd5000003"),
ObjectId("92c21476238d3b4dd5000032")
]
}
4.3 Nuts and bolts: on databases, collections and documents
4.3.1 Databases
a database is created automatically once you write to a collection in that database. Have a look at this Ruby code:connection = Mongo::Client.new( [ '127.0.0.1:27017' ], :database => 'garden' )
db = connection.database
You can delete all the data in this collection by calling:products = db['products']
products.insert_one({:name => "Extra Large Wheelbarrow"})
This command doesn’t remove the collection itself; it only empties it. To remove a collection entirely, you use the drop method, like this:products.find({}).delete_many
You can drop the garden database from Ruby like so:products.drop
From the MongoDB shell, run the dropDatabase() method using JavaScript:db.drop
The data files reside in whichever directory you designated as the dbpath when starting mongod. When left unspecified, mongod stores all its files in /data/db.3 Let’s see how this directory looks after creating the garden database:use garden
db.dropDatabase();
First note the mongod.lock file, which stores the server’s process ID.
You can always check the amount of space used versus the amount allocated by using the stats command in the JavaScript shell:
the fileSize field indicates the total size of files allocated for this database.
the dataSize is the actural size of the BSON objects in the database
the storageSize includes extra space reserved for collection growth and also unallocated deleted space
4.3.2 Collections
db.createCollection("users")
db.createCollection("users", {size: 20000})
reanme collection
db.products.renameCollection("store_products")
capped collection is distinguished from standard collections by their fixed size. This means that once a capped collection reaches its maximum size, subsequent inserts will overwirte the least-recently-inserted documents in the collection.
require 'mongo'
VIEW_PRODUCT = 0 # action type constants
ADD_TO_CART = 1
CHECKOUT = 2
PURCHASE = 3
client = Mongo::Client.new([ '127.0.0.1:27017' ], :database => 'garden')
client[:user_actions].drop
actions = client[:user_actions, :capped => true, :size => 16384]
actions.create
500.times do |n| # loop 500 times, using n as the iterator
doc = {
:username => "kbanker",
:action_code => rand(4), # random value between 0 and 3, inclusive
:time => Time.now.utc,
:n => n
}
actions.insert_one(doc)
end
query the collection from the shell:
use garden
db.user_actions.count();
mongodb allows you to specify a maximum number of documents for a capped collection with the max
parameter
this size configuration has precedence
db.createCollection("users.actions", {capped: true, size: 16384, max: 100})
capped collections don’t allow all operations available for a normal collection.
mongodb also allows you to expire documents from a collection after a certain amount of time has passed.
this functionality is actually implemented using a special kind of index.
db.reviews.createIndex({time_field: 1}, {expireAfterSeconds: 3600})
db.reviews.insert({
time_field: new Date(),
...
})
TTL indexes just measure the difference between the indexed value and the current time, to compare to expireAfterSeconds
TTL indexes ‘s restrictions:
can’t have a TTL index on _id or on a field used in another index
can’t use TTL indexexs with capped collections because they don’t support removing individual documents
can’t have compound TTL indexes, though you can have an array of timestamps in the indexes field.
Two of the special system collections are system.namespaces and system.indexes
db.system.namespaces.find()
system.indexes
stores each index definition for the current database
db.system.indexes.find()
4.3.3 Documents and insertion
Document serialization, types, and limits
all documents are serialized to BSON befor being sent to MongoDB; they’re later deserialized from BSON
doc = {
:_id => BSON::ObjectId.new,
:username => "kbanker",
:action_code => rand(5),
:time => Time.now.utc,
:n => 1
}
bson = doc.to_bson
puts "Document #{doc.inspect} takes up #{bson.length} bytes as BSON"
The serialize
method returns a byte array
string_io = StringIO.new(bson)
deserialized_doc = String.from_bson(string_io)
puts "Here's our document deserialized from BSON:"
puts deserialized_doc.inspect
BSON types:
- strings
numbers: double, int , and long
db.numbers.save({n: 5});
db.numbers.save({n: NumberLong(5)})
db.numbers.find({n: 5});
you’ll see that doubles are type 1 and 64-bit integers are type 18. Thus, you can query the collection for values by type:
The only other issue that commonly arises with BSON numeric types is the lack of decimal support. This means that if you’re planning on storing currency values in MongoDB, you need to use an integer type and keep the values in cents.datetimes
- virtual types
- limits on documents
- bulk inserts