Deploying Hexo to Alibaba Cloud OSS

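I recently migrated my blog from WordPress to Hexo, mainly because WordPress is bloated: a personal blog simply does not need that many features. After all these years WordPress still has no really good Markdown support, while with the rise of GitHub the programmer community has long since settled on Markdown as the preferred way to write blog posts. There are many static site generators, so why Hexo? No special reason: the JS community is thriving (with no shortage of people digging pits for others to fall into), and the Hexo website is even fully localized into Chinese.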

Note: this approach also works for UPYUN (又拍) and Qiniu (七牛).

I won't cover installing or using Hexo here; see the official site and hexo --help. Instead, here are a few problems I ran into.

For npm, pip, gem, and other package managers, use a mirror when working from mainland China, for example the USTC or Taobao mirrors.

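On OS X, install with npm install hexo --no-optional; see the official documentation.
A Hexo upgrade or a difference in your environment may break the theme you use. Searching the questions already filed against the theme usually turns up an answer.
Converting a WordPress export with hexo-migrator-wordpress, the tool Hexo officially recommends, may well fail. Don't agonize over it: the task is just turning an XML file into Markdown, and there are many ways to do that. I used the method Pelican recommends; some of the output is nonstandard and needs script-assisted manual cleanup.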

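Now for the key points of deploying Hexo to Alibaba Cloud OSS. The biggest difference from an ordinary web server is that OSS does not work out the type of the requested file, that is, the Content-Type header, on its own. By default it returns application/octet-stream, so the browser downloads the resource instead of rendering it; the type therefore has to be specified explicitly. In addition, OSS does not resolve directory paths (it is designed as storage, not as a web server): requesting a.com/b/ does not return the content of a.com/b/index.html.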
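To make the Content-Type point concrete, here is a minimal sketch of a suffix-based lookup using only the standard library. It is written for Python 3, and the helper name guess_content_type is hypothetical; the text/html fallback mirrors the default the script in this post uses for files with no recognizable suffix.

```python
import mimetypes


def guess_content_type(file_path, default="text/html"):
    """Return a Content-Type for file_path based on its file name suffix.

    guess_content_type is a hypothetical helper name; the text/html
    fallback is an assumption copied from the script in this post.
    """
    content_type, _encoding = mimetypes.guess_type(file_path)
    return content_type or default


if __name__ == "__main__":
    for name in ("index.html", "css/style.css", "oss_fuck_md5"):
        print(name, "->", guess_content_type(name))
```

The returned value is what would be sent as the Content-Type header of the PUT request, so that OSS stores it and serves the object with that type.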

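The fix for these two problems is as follows. For every uploaded file, derive its MIME type from the file name suffix and send it as the Content-Type header. Then process every href in the HTML that Hexo generates: append index.html to its path and, if a file exists at that path locally, replace the href with that absolute path. For example, /b or /b/ becomes /b/index.html. The complete Python 2 script below does both; it depends on requests and lxml.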
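The href rule described above can be sketched in isolation, without lxml. This is an illustrative Python 3 version using only the standard library; rewrite_href is a hypothetical name, and the guard that leaves non-root-relative links untouched is my own assumption rather than something the full script does.

```python
import os
import posixpath
import tempfile


def rewrite_href(href, public_dir):
    """Append 'index.html' to href when that file exists under public_dir.

    Hypothetical helper: links that do not start with '/' (external or
    relative links) are returned unchanged, which is an added assumption.
    """
    if not href.startswith("/"):
        return href
    candidate = posixpath.join(href, "index.html")
    # Map the URL path onto the local output directory.
    local_path = os.path.join(public_dir, *candidate.lstrip("/").split("/"))
    return candidate if os.path.isfile(local_path) else href


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as public_dir:
        os.makedirs(os.path.join(public_dir, "b"))
        with open(os.path.join(public_dir, "b", "index.html"), "w") as fh:
            fh.write("<html></html>")
        for link in ("/b", "/b/", "/missing/", "https://example.com/"):
            print(link, "->", rewrite_href(link, public_dir))
```

Checking the local output directory before rewriting keeps links to real files such as stylesheets unchanged, because no index.html exists beneath them.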

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: ficapy

import os
import requests
import base64
import hmac
import datetime
from os.path import join, exists, relpath
from mimetypes import MimeTypes
from hashlib import sha1, md5
from pickle import dump, loads
from cStringIO import StringIO, InputType, OutputType
from lxml import html

KEY             = '----------'
SECRET          = '----------'
BUCKET          = '----------'
url             = '----------'
ABSOLUTE_PATH   = '/Users/ficapy/CodeSpace/Blog/public'

s = requests.Session()

headers = {
    'Date': datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT'),
    'Content-Type': 'text/html',
    'HOST': url,
}


def signature(file_path):
    global headers
    path = join(ABSOLUTE_PATH, file_path)
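    # Refresh Date for each request so a long run does not sign with a stale timestamp.
    headers['Date'] = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')
    headers['Content-Type'] = MimeTypes().guess_type(path)[0] or 'text/html'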
    sign = base64.b64encode(hmac.new(SECRET,
                                     'PUT' + "\n"
                                     + '' + "\n"
                                     + headers['Content-Type'] + "\n"
                                     + headers['Date'] + "\n"
                                     + '/' + BUCKET + '/' + file_path, sha1).digest()).strip()
    headers.update({'Authorization': 'OSS ' + KEY + ':' + sign})


def read_in_chunks(file_object, blocksize=1024):
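    # Accept either an in-memory buffer or a path relative to ABSOLUTE_PATH.
    # Files are opened in binary mode so images and fonts upload unchanged.
    file_object = file_object if isinstance(file_object, (InputType, OutputType)) else open(
        join(ABSOLUTE_PATH, file_object), 'rb')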
    file_object.seek(0)
    while 1:
        data = file_object.read(blocksize)
        if not data:
            break
        yield data


def upload(file_path, file_object):
    signature(file_path)
    s.put('http://' + url + '/' + file_path, headers=headers, data=read_in_chunks(file_object)).raise_for_status()
    print('upload done: ' + file_path)


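# First request the oss_fuck_md5 manifest. If it does not exist, upload everything.
# If it does, compare it with the locally computed md5 values and upload only files
# that are missing from the manifest or have changed.
# The oss_fuck_md5 manifest itself is uploaded last, after all files have succeeded.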
def local_md5():
    file_md5_mp = {}
    for root, dirnames, files in os.walk(ABSOLUTE_PATH):
        for file in files:
            fullpath = join(root, file)
            rel = relpath(fullpath, ABSOLUTE_PATH)
            # Hash the raw bytes and close the file promptly.
            with open(fullpath, 'rb') as f:
                file_md5_mp[rel] = md5(f.read()).hexdigest()
    return file_md5_mp


def remote_md5():
    a = s.get('http://' + url + '/oss_fuck_md5')
    if a.status_code != 200:
        return {}
    else:
        return loads(a.content)


def path_process(file_path):
    file_path = join(ABSOLUTE_PATH, file_path)
    tree = html.parse(file_path).getroot()
    hrefs = tree.xpath('//*[@href]')

    for element in hrefs:
        parts = element.attrib['href'].split()
        if not parts:
            continue  # skip empty href attributes
        rewrite = join(parts[0], 'index.html')
        if exists(join(ABSOLUTE_PATH, rewrite[1:])):
            element.attrib['href'] = rewrite
    f = StringIO()
    f.write(html.tostring(tree))
    return f


def main():
    remote = remote_md5()
    local_all = local_md5()
    # Upload only files that are new or whose md5 differs from the remote manifest.
    changed = [p for p in local_all if local_all[p] != remote.get(p)]

    for path in changed:
        if path.endswith('html'):
            upload(path, path_process(path))
        else:
            upload(path, path)

    # Upload the manifest last, after every file has succeeded.
    f = StringIO()
    dump(local_all, f)
    upload('oss_fuck_md5', f)


if __name__ == '__main__':
    main()
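For readers on Python 3, where hmac needs bytes rather than str, the signing step of the script above might look like the sketch below. It builds the same string to sign as signature() (PUT, an empty Content-MD5 line, Content-Type, Date, then /bucket/object). The function names and the example credentials are hypothetical, and using email.utils.formatdate for the Date header is my substitution for strftime, chosen because it does not depend on the locale.

```python
import base64
import hashlib
import hmac
from email.utils import formatdate


def oss_put_signature(secret, bucket, object_key, content_type, date):
    """Sign a PUT request the same way the script's signature() does.

    Hypothetical helper: takes str arguments and returns a str signature.
    """
    string_to_sign = "\n".join([
        "PUT",
        "",  # Content-MD5, left empty as in the script above
        content_type,
        date,
        "/" + bucket + "/" + object_key,
    ])
    digest = hmac.new(secret.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(key, signature):
    """Format the Authorization header value used by the script."""
    return "OSS " + key + ":" + signature


if __name__ == "__main__":
    date = formatdate(usegmt=True)  # RFC 1123 style date in GMT
    sig = oss_put_signature("example-secret", "example-bucket",
                            "index.html", "text/html", date)
    print(authorization_header("example-key", sig))
```

The Date passed to the signing function must be the exact string sent in the request's Date header, otherwise the server-side signature check will not match.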

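The next step is binding a domain name, which is easy to find in the OSS console and very simple to do. The next post will cover generating a free Let's Encrypt certificate, binding it to Alibaba Cloud CDN, and making a few simple optimizations.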

ficapy
