Convert Microsoft Word HTML to GitHub Flavored Markdown
- Save a Word document as HTML
- Convert the encoding from `gb2312` to `utf8`, and fix the `charset` value in the HTML metadata.
- Convert the clean HTML with pandoc
iconv -f gb2312 -t utf-8 input.html | sed -e "s/charset=gb2312/charset=utf8/g" | pandoc -f html -t markdown_github -o output.md
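The one-liner above can also be split into two small shell functions, which makes it easier to inspect the re-encoded HTML before handing it to pandoc. This is only a sketch: the function names `html_to_utf8` and `html_to_gfm` are made up for illustration, and it assumes the Word export really is GB2312 and declares `charset=gb2312` in lowercase. The second function uses `gfm`, the GitHub Flavored Markdown writer in pandoc 2.0 and later, where `markdown_github` is deprecated; on an older pandoc, keep `markdown_github`.

```shell
#!/bin/sh

# Re-encode a Word-exported HTML file from GB2312 to UTF-8 and update
# the charset declared in its metadata. Writes the result to stdout.
# (Hypothetical helper name; assumes a lowercase "charset=gb2312".)
html_to_utf8() {
    iconv -f gb2312 -t utf-8 "$1" | sed -e 's/charset=gb2312/charset=utf-8/g'
}

# Convert the cleaned HTML to GitHub Flavored Markdown.
# $1 = input HTML file, $2 = output Markdown file.
# (Hypothetical helper name; "gfm" assumes pandoc 2.0 or later.)
html_to_gfm() {
    html_to_utf8 "$1" | pandoc -f html -t gfm -o "$2"
}
```

After sourcing the file, `html_to_utf8 input.html | head` shows whether the text decoded correctly, and `html_to_gfm input.html output.md` runs the full conversion.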