AWS DynamoDB Notes

AWS DynamoDB is a fully managed NoSQL database service from Amazon Web Services that works as both a key-value store and a document store. With provisioned capacity you pay mainly for the read and write throughput you reserve, plus the data you store, rather than for the running hours of database instances.

ref:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html
http://www.slideshare.net/AmazonWebServices/design-patterns-using-amazon-dynamodb

Glossary

DynamoDB is schema-less.

  • table: a table is a collection of items.
  • item: an item is a collection of attributes (key-value pairs).
  • attribute: an attribute is similar to a field or column in other databases.
  • primary key: one or two attributes that can uniquely identify every item in a table.
    • partition key (aka hash key): a simple primary key, composed of one attribute.
    • partition key and sort key (aka range key): a composite primary key, composed of two attributes.

ref:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComponents.html

Global Secondary Index (GSI)

A secondary index is an additional set of keys besides the primary key.
A table can have several secondary indexes.
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SecondaryIndexes.html

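A GSI can be added to a table whose primary key is either a partition key alone or partition + sort key.
Like a primary key, a GSI can be simple or composite.
GSIs can be added or removed at any time.

If you don't need strong consistency, or the data under a single partition key value can grow beyond 10 GB (the item collection size limit on tables that have an LSI),
use a GSI.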

ref:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html
http://iamgarlic.blogspot.tw/2015/01/amazon-dynamodb-global-secondary-index.html

Local Secondary Index (LSI)

An LSI can only be used on a table with a partition + sort key.
An LSI must pair the table's original partition key with another attribute as its sort key (so an LSI is always composite).
An LSI can only be defined when the table is created.

ref:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LSI.html
http://iamgarlic.blogspot.tw/2015/01/amazon-dynamodb-local-secondary-index.html
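
The LSI rules above can be written down as a small check. This is only a sketch for reasoning about key schemas before creating a table: `check_lsi_key_schema` is a hypothetical helper, not part of any AWS SDK, and it only covers the three rules listed in these notes.

```python
def check_lsi_key_schema(table_keys, index_keys):
    """Return a list of problems with a proposed LSI key schema.

    table_keys and index_keys are (partition_key, sort_key) tuples of
    attribute names; sort_key may be None for a simple key.
    """
    table_pk, table_sk = table_keys
    index_pk, index_sk = index_keys
    problems = []
    if table_sk is None:
        problems.append("table has no sort key, so it cannot have an LSI")
    if index_pk != table_pk:
        problems.append("an LSI must reuse the table's partition key")
    if index_sk is None:
        problems.append("an LSI must define a sort key")
    return problems


# The table is keyed by issueNumber + id, so an index on
# issueNumber + categoryCode would qualify as an LSI.
print(check_lsi_key_schema(("issueNumber", "id"), ("issueNumber", "categoryCode")))

# An index on categoryCode + id changes the partition key,
# so it has to be a GSI instead.
print(check_lsi_key_schema(("issueNumber", "id"), ("categoryCode", "id")))
```

An empty list means the proposed index fits the LSI rules; anything else suggests a GSI is the better fit.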

Query and Scan

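Avoid scan whenever you can.
After all, a scan reads every item in the table.

The primary key and local secondary indexes can only be specified when the table is created.
Once created, they cannot be changed.
Global secondary indexes do not have this restriction.

If the primary key is partition + sort key:
for get, you must provide both the partition key and the sort key.
For query, the sort key is optional, but the partition key is always required.

Whether it belongs to a primary key, a GSI, or an LSI,
a partition key attribute can only be queried with =.
It has no rich query capability (conditions such as >, <, between, begins_with).
Only the sort key supports rich queries.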
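
To make the query rules concrete, here is a minimal sketch that builds the parameters of a Query request: equality on the partition key, plus an optional condition on the sort key. `build_query_params` is a hypothetical helper written for these notes. The dict keys (`TableName`, `IndexName`, `KeyConditionExpression`, `ExpressionAttributeNames`, `ExpressionAttributeValues`) follow the DynamoDB Query API, and values are passed through untouched, so they must already be in whatever form your client expects.

```python
def build_query_params(table_name, partition_key, partition_value,
                       sort_key=None, sort_op=None, sort_values=(),
                       index_name=None):
    """Build Query parameters: '=' on the partition key, optional sort key condition."""
    names = {"#pk": partition_key}
    values = {":pk": partition_value}
    condition = "#pk = :pk"  # the partition key only supports equality

    if sort_key is not None:
        names["#sk"] = sort_key
        if sort_op in ("=", "<", "<=", ">", ">="):
            values[":sk"] = sort_values[0]
            condition += " AND #sk {} :sk".format(sort_op)
        elif sort_op == "between":
            values[":lo"], values[":hi"] = sort_values
            condition += " AND #sk BETWEEN :lo AND :hi"
        elif sort_op == "begins_with":
            values[":sk"] = sort_values[0]
            condition += " AND begins_with(#sk, :sk)"
        else:
            raise ValueError("unsupported sort key operator: {!r}".format(sort_op))

    params = {
        "TableName": table_name,
        "KeyConditionExpression": condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }
    if index_name is not None:
        params["IndexName"] = index_name  # query a GSI or LSI instead of the table
    return params


# Partition key only: every post of one issue.
print(build_query_params("CodeTengu_WeeklyPost", "issueNumber", 42))
```

The same helper covers indexes: pass `index_name` together with the index's own partition and sort key names.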

Best Practices
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/BestPractices.html

Choosing a Partition Key
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html

Querying DynamoDB by date
http://stackoverflow.com/questions/14836600/querying-dynamodb-by-date

Pick an item randomly
http://stackoverflow.com/questions/10666364/aws-dynamodb-pick-a-record-item-randomly

ref:
https://www.uplift.agency/blog/posts/2016/03/clearcare-dynamodb
https://medium.com/building-timehop/one-year-of-dynamodb-at-timehop-f761d9fe5fa1#.3g97b3lqy

Commands

DynamoDB is schema-less, so when creating a table you only define the attributes used by the primary key or by secondary indexes.

# you can use the project name as a prefix for table names
# read / write capacity units can be changed at any time later
$ aws dynamodb create-table \
--table-name CodeTengu_Preference \
--attribute-definitions AttributeName=name,AttributeType=S \
--key-schema AttributeName=name,KeyType=HASH \
--provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

$ aws dynamodb create-table \
--table-name CodeTengu_WeeklyIssue \
--attribute-definitions AttributeName=number,AttributeType=N AttributeName=publication,AttributeType=S AttributeName=publishedAt,AttributeType=N \
--key-schema AttributeName=number,KeyType=HASH \
--global-secondary-indexes IndexName=publication_published_at,KeySchema='[{AttributeName=publication,KeyType=HASH},{AttributeName=publishedAt,KeyType=RANGE}]',Projection='{ProjectionType=ALL}',ProvisionedThroughput='{ReadCapacityUnits=5,WriteCapacityUnits=5}' \
--provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

$ aws dynamodb create-table \
--table-name CodeTengu_WeeklyPost \
--attribute-definitions AttributeName=issueNumber,AttributeType=N AttributeName=id,AttributeType=N  AttributeName=categoryCode,AttributeType=S \
--key-schema AttributeName=issueNumber,KeyType=HASH AttributeName=id,KeyType=RANGE \
--global-secondary-indexes IndexName=categoryCode_id,KeySchema='[{AttributeName=categoryCode,KeyType=HASH},{AttributeName=id,KeyType=RANGE}]',Projection='{ProjectionType=ALL}',ProvisionedThroughput='{ReadCapacityUnits=5,WriteCapacityUnits=5}' \
--provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

ref:
http://docs.aws.amazon.com/cli/latest/reference/dynamodb/create-table.html
http://docs.aws.amazon.com/cli/latest/reference/dynamodb/update-table.html
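
The create-table commands above map onto the CreateTable request parameters. As a rough sketch, the hypothetical helper `build_table_spec` below assembles those parameters for a simple or composite primary key; the resulting dict could then be handed to an SDK client of your choice. Secondary indexes are left out to keep it short.

```python
def build_table_spec(table_name, partition_key, sort_key=None,
                     read_units=5, write_units=5):
    """Return CreateTable parameters for a simple or composite primary key.

    partition_key and sort_key are (attribute_name, attribute_type) tuples,
    where attribute_type is "S", "N" or "B".
    """
    keys = [(partition_key, "HASH")]
    if sort_key is not None:
        keys.append((sort_key, "RANGE"))
    return {
        "TableName": table_name,
        # Only key attributes are declared; other attributes are schema-less.
        "AttributeDefinitions": [
            {"AttributeName": name, "AttributeType": attr_type}
            for (name, attr_type), _ in keys
        ],
        "KeySchema": [
            {"AttributeName": name, "KeyType": key_type}
            for (name, _), key_type in keys
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


spec = build_table_spec("CodeTengu_WeeklyPost", ("issueNumber", "N"), ("id", "N"))
print(spec["KeySchema"])
```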

$ aws dynamodb put-item \
--table-name CodeTengu_Preference \
--item file://fixtures/curated_api_config.json \
--return-consumed-capacity TOTAL

# fixtures/curated_api_config.json
{
  "name": { "S": "curated_api_config" },
  "apiKey": { "S": "xxx" }
}

ref:
http://docs.aws.amazon.com/cli/latest/reference/dynamodb/put-item.html
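
The fixture above uses DynamoDB's typed attribute-value format, where every value is wrapped in a type descriptor such as `S` or `N`. Writing that by hand gets tedious, so here is a small sketch that converts plain Python values into that shape. `to_attribute_value` and `to_item` are hypothetical helpers that handle only the common types (sets and binary are left out).

```python
import json


def to_attribute_value(value):
    """Wrap a plain Python value in a DynamoDB type descriptor."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if value is None:
        return {"NULL": True}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError("unsupported type: {}".format(type(value).__name__))


def to_item(plain):
    """Convert a plain dict into an item usable with `put-item --item`."""
    return {name: to_attribute_value(value) for name, value in plain.items()}


print(json.dumps(to_item({"name": "curated_api_config", "apiKey": "xxx"}), indent=2))
```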

$ aws dynamodb get-item \
--table-name CodeTengu_WeeklyIssue \
--key '{"number": {"N": "42"}}'

ref:
http://docs.aws.amazon.com/cli/latest/reference/dynamodb/get-item.html

Usage

You should use AWS.DynamoDB.DocumentClient
rather than AWS.DynamoDB directly.

const AWS = require('aws-sdk');

const dynamodb = new AWS.DynamoDB({ apiVersion: '2012-08-10', region: 'ap-northeast-1' });
const dynamodbClient = new AWS.DynamoDB.DocumentClient({ service: dynamodb });

const params = {
  RequestItems: {
    CodeTengu_Preference: {
      Keys: [
        { name: 'xxx' },
      ],
    },
  },
};

dynamodbClient.batchGet(params, (err, data) => {
  if (err) {
    console.log('fail');
    console.log(err);
  } else {
    console.log('success');
    console.log(data);
  }
});

ref:
http://aws.amazon.com/sdk-for-node-js/
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB.html
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB/DocumentClient.html
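
One thing to watch with batch gets: a single BatchGetItem request accepts at most 100 keys, so longer key lists have to be split. The sketch below shows one way to build the request parameters in chunks. `batch_get_requests` is a hypothetical helper, and the keys are plain values in the DocumentClient style used above.

```python
def batch_get_requests(table_name, keys, batch_size=100):
    """Yield BatchGetItem parameter dicts with at most batch_size keys each."""
    for start in range(0, len(keys), batch_size):
        chunk = keys[start:start + batch_size]
        yield {"RequestItems": {table_name: {"Keys": chunk}}}


keys = [{"name": "key_{}".format(i)} for i in range(250)]
for params in batch_get_requests("CodeTengu_Preference", keys):
    print(len(params["RequestItems"]["CodeTengu_Preference"]["Keys"]))
```

A response can also come back with UnprocessedKeys, which would need to be retried; that part is left out here.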

The complete code is on GitHub:
https://github.com/CodeTengu/lambdabaku

awscli: Command-line Interface for Amazon Web Services

awscli is the official command-line interface for all Amazon Web Services (AWS).

ref:
https://github.com/aws/aws-cli

Configuration

$ pip install awscli

$ aws configure

ref:
https://docs.aws.amazon.com/cli/latest/index.html

S3

Download A Folder

$ aws s3 sync \
s3://files.vinta.ws/static/images/stickers/ \
.

ref:
https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
https://docs.aws.amazon.com/cli/latest/userguide/cli-services-s3-commands.html#using-s3-commands-managing-objects

Rename A Folder

# cp leaves the source prefix in place; use mv instead of cp to actually move it
$ aws s3 cp \
s3://files.vinta.ws/static/images/stickers_BACKUP/ \
s3://files.vinta.ws/static/images/stickers/ \
--recursive

ref:
https://docs.aws.amazon.com/cli/latest/reference/s3/cp.html

Make A Folder Public Read

$ aws s3 sync \
s3://files.vinta.ws/static/ \
s3://files.vinta.ws/static/ \
--grants read=uri=http://acs.amazonaws.com/groups/global/AllUsers

Upload Files

# also make them public read
$ aws s3 cp \
. \
s3://files.vinta.ws/static/images/stickers/ \
--recursive \
--grants read=uri=http://acs.amazonaws.com/groups/global/AllUsers

$ aws s3 cp \
db.sqlite3 \
s3://files.albedo.one/

# sync is always recursive and does not take a --recursive flag
$ aws s3 sync \
./ \
s3://files.albedo.one/ \
--exclude "*" --include "*.pickle"
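
The order of --exclude and --include matters: everything is included by default, and filters that appear later on the command line take precedence. The sketch below is a simplified model of that behaviour using Python's `fnmatch`. `is_selected` is a hypothetical helper, and the real CLI's pattern matching may differ in details.

```python
from fnmatch import fnmatchcase


def is_selected(path, filters):
    """Apply (kind, pattern) filters in order; the last matching filter wins."""
    selected = True  # everything is included by default
    for kind, pattern in filters:
        if fnmatchcase(path, pattern):
            selected = (kind == "include")
    return selected


filters = [("exclude", "*"), ("include", "*.pickle")]
for path in ["model.pickle", "db.sqlite3"]:
    print(path, is_selected(path, filters))
```

Swapping the two filters would exclude everything, because the trailing `--exclude "*"` would override the earlier include.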

Copy Files Between S3 Buckets

$ aws s3 sync s3://your_bucket_1/media s3://your_bucket_2/media \
--acl "public-read" \
--exclude "track_audio/*"

Remove Files

$ aws s3 rm s3://your_bucket_1/media/track_audio --recursive

ref:
https://docs.aws.amazon.com/cli/latest/reference/s3/rm.html

Send Emails in Django

Sending emails with Amazon SES, Mailgun, Zoho, or Gmail in Django.

Configuration

in settings.py

SERVER_EMAIL = '[email protected]'
DEFAULT_FROM_EMAIL = 'Hourmasters <{0}>'.format(SERVER_EMAIL)
REPLY_TO_EMAIL = '[email protected]'

Amazon SES (Simple Email Service)

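  1. Verify your domain in Amazon SES
  2. Create an email account with your email provider (e.g. Google Apps), for example [email protected]
  3. Verify that email address in Amazon SES
  4. Check the inbox and click the link in the confirmation email
  5. Request a Sending Limit Increase in Amazon SES

If you have not requested a sending limit increase,
your account stays in a sandbox by default,
where you can only send to email addresses you have verified in Amazon SES.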

ref:
https://console.aws.amazon.com/ses/home

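$ pip install django-ses

in settings.py

# the SMTP interface expects SES SMTP credentials (created in the SES console),
# which are not the same as your regular AWS access key and secret
EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
EMAIL_HOST = 'email-smtp.us-east-1.amazonaws.com'
EMAIL_HOST_USER = 'YOUR_SES_SMTP_USERNAME'
EMAIL_HOST_PASSWORD = 'YOUR_SES_SMTP_PASSWORD'
EMAIL_PORT = 587
EMAIL_USE_TLS = True

$ python manage.py ses_email_address -l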

ref:
https://github.com/django-ses/django-ses

Mailgun

EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
EMAIL_HOST = 'smtp.mailgun.org'
EMAIL_HOST_USER = '[email protected]'
EMAIL_HOST_PASSWORD = 'XXX'
EMAIL_PORT = 587
EMAIL_USE_TLS = True

If you are already using EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend',
you can switch seamlessly to https://github.com/pmclanahan/django-celery-email

ref:
http://www.mailgun.com/pricing
http://thingsilearned.com/2011/06/07/mailgun-as-an-smtp-server-for-django-apps/

Zoho

Django versions before 1.7 have no EMAIL_USE_SSL setting,
so connections to Zoho's mail server time out.
You can install django-smtp-ssl instead.

EMAIL_BACKEND = 'django_smtp_ssl.SSLEmailBackend'
EMAIL_HOST = 'smtp.zoho.com'
EMAIL_HOST_USER = '[email protected]'
EMAIL_HOST_PASSWORD = 'XXX'
EMAIL_PORT = 465

ref:
https://github.com/bancek/django-smtp-ssl
https://stackoverflow.com/questions/18335697/send-email-through-zoho-smtp
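
On Django 1.7 or later, the built-in SMTP backend should be enough, since it supports EMAIL_USE_SSL for implicit SSL on port 465. The fragment below is a sketch of such a settings.py section. Reading credentials from the environment variables `ZOHO_EMAIL_USER` and `ZOHO_EMAIL_PASSWORD` is an assumption made for this example, not something Django or Zoho requires.

```python
import os

# Built-in backend; no django-smtp-ssl needed on Django >= 1.7.
EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
EMAIL_HOST = 'smtp.zoho.com'
EMAIL_PORT = 465
EMAIL_USE_SSL = True  # implicit SSL; do not also set EMAIL_USE_TLS

# Hypothetical environment variable names, chosen for this sketch.
EMAIL_HOST_USER = os.environ.get('ZOHO_EMAIL_USER', '')
EMAIL_HOST_PASSWORD = os.environ.get('ZOHO_EMAIL_PASSWORD', '')
```

EMAIL_USE_SSL and EMAIL_USE_TLS are mutually exclusive, so only one of them should be True.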

Gmail

EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
EMAIL_HOST = 'smtp.gmail.com'
EMAIL_HOST_USER = '[email protected]'  # can also be a Google Apps account
EMAIL_HOST_PASSWORD = 'XXX'
EMAIL_PORT = 587
EMAIL_USE_TLS = True

Usage

in views.py

from django.conf import settings
from django.core.mail import EmailMessage
from django.core.mail import send_mail
from django.template.loader import render_to_string

mail_context = {
    'name': 'Vinta',
    'email': '[email protected]',
    'content': 'YOU SUCK',
}
msg = EmailMessage(
    subject='Subject',
    body=render_to_string('email/contact_email.html', mail_context),
    from_email='[email protected]',
    to=['[email protected]', '[email protected]'],
    headers={'Reply-To': settings.REPLY_TO_EMAIL},
)
msg.content_subtype = 'html'  # or 'plain'
msg.send()

# or

send_mail(
    'Subject',
    'Message',
    'YOUR NAME <[email protected]>',
    ['[email protected]', '[email protected]']
)

ref:
https://docs.djangoproject.com/en/dev/topics/email/

Grant Access to a Single S3 Bucket via Amazon IAM

Create an IAM user that is only allowed to access specific resources.

Go to Users > Attach User Policy > Policy Generator on the web console.

ref:
https://console.aws.amazon.com/iam/home?#users

Example 1

Allow full access to a certain bucket.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::files.albedo.one",
                "arn:aws:s3:::files.albedo.one/*"
            ]
        }
    ]
}
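
The policy lists two resources because bucket-level actions (such as listing) apply to the bucket ARN, while object-level actions apply to the `/*` ARN. If you need the same policy for several buckets, a small generator can help. `single_bucket_policy` below is a hypothetical helper sketched for these notes; it only produces the JSON document, which you would still paste into the console yourself.

```python
import json


def single_bucket_policy(bucket_name, actions="s3:*"):
    """Return an IAM policy document granting `actions` on one bucket only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                "Resource": [
                    # the bucket itself (bucket-level actions)
                    "arn:aws:s3:::{}".format(bucket_name),
                    # every object inside it (object-level actions)
                    "arn:aws:s3:::{}/*".format(bucket_name),
                ],
            }
        ],
    }


print(json.dumps(single_bucket_policy("files.albedo.one"), indent=4))
```

Passing a narrower `actions` list, for example `["s3:GetObject", "s3:ListBucket"]`, gives a read-only variant.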

Example 2

For BackWPup, a WordPress plugin:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": "arn:aws:s3:::*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:CreateBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::files.vinta.ws",
                "arn:aws:s3:::files.vinta.ws/*"
            ]
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*",
                "s3:Put*"
            ],
            "Resource": [
                "arn:aws:s3:::files.vinta.ws",
                "arn:aws:s3:::files.vinta.ws/*"
            ]
        }
    ]
}

ref:
https://console.aws.amazon.com/iam/home#users