Word documents with images to markdown via pandoc

You can easily convert docx files to md via pandoc:

pandoc file.docx -o file.md

Images maybe encoded in the markdown file like so:

![image3](media/image3.jpg)

But we won’t see the media folder in output.

Since docx files are just archives:

M Filemode      Length  Date         Time      File
- ----------  --------  -----------  --------  ------------------------------
  -rw-rw-rw-      4980   4-May-2021  03:40:42  word/footnotes.xml
  -rw-rw-rw-      1341   4-May-2021  03:40:42  word/numbering.xml
  -rw-rw-rw-      1770   4-May-2021  03:40:42  word/settings.xml
  -rw-rw-rw-      1432   4-May-2021  03:40:42  word/fontTable.xml
  -rw-rw-rw-      6753   4-May-2021  03:40:42  word/styles.xml
  -rw-rw-rw-       404   4-May-2021  03:40:42  customXML/itemProps1.xml
  -rw-rw-rw-       514   4-May-2021  03:40:42  customXML/item1.xml
  -rw-rw-rw-       295   4-May-2021  03:40:42  customXML/_rels/item1.xml.rels
  -rw-rw-rw-       480   4-May-2021  03:40:42  docProps/core.xml
  -rw-rw-rw-     22532   4-May-2021  03:40:42  word/document.xml
  -rw-rw-rw-      1484   4-May-2021  03:40:42  word/_rels/document.xml.rels
  -rw-rw-rw-       443   4-May-2021  03:40:42  _rels/.rels
  -rw-rw-rw-      8393   4-May-2021  03:40:42  word/theme/theme1.xml
  -rw-rw-rw-     35436   4-May-2021  03:40:42  word/media/image1.png
  -rw-rw-rw-     10870   4-May-2021  03:40:42  word/media/image2.jpg
  -rw-rw-rw-     10041   4-May-2021  03:40:42  word/media/image3.jpg
  -rw-rw-rw-      1622   4-May-2021  03:40:42  [Content_Types].xml
- ----------  --------  -----------  --------  ------------------------------
                108790                         17 files

We can just extract the relevant parts:

unzip file.docx -d <tmp-folder>

And copy them out.