CodiMD Note Exporter + HedgeDoc Note Importer
This little tool helps you back up the notes listed in your CodiMD history to a local folder. It complements CodiMD's "export user data" functionality, which only downloads documents that you own. Do not upload this backup in its entirety, since doing so would create ownership issues for shared documents.
The "export user data" functionality of CodiMD creates an archive containing all the documents you own. This tool additionally lets you upload all, or a selection, of your owned documents from this archive. Based on the path under which CodiMD served a document, the script tries to migrate it to the same path where possible, e.g., https://md.inf.tu-dresden.de/my_custom_path is mapped to https://md.inf.tu-dresden.de/notes/my_custom_path.
Be aware that this approach is a little hacky. Unfortunately, this is necessary because our CodiMD instance used Keycloak as an OAuth provider.
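The path mapping in the example above can be illustrated with a minimal sketch. The function name map_codimd_to_hedgedoc and the default prefix argument are hypothetical and chosen for illustration only; the actual script may derive the target path differently.

```python
from urllib.parse import urlsplit, urlunsplit


def map_codimd_to_hedgedoc(url: str, prefix: str = "/notes") -> str:
    """Insert `prefix` in front of the path of `url` (hypothetical helper)."""
    parts = urlsplit(url)
    # Normalise slashes so that exactly one separates prefix and path.
    new_path = prefix.rstrip("/") + "/" + parts.path.lstrip("/")
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


print(map_codimd_to_hedgedoc("https://md.inf.tu-dresden.de/my_custom_path"))
```

With the URL from the example, this prints https://md.inf.tu-dresden.de/notes/my_custom_path.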
Follow the steps closely to ensure that everything works as intended and that the ownership of documents is not messed up.
Requirements
Its only dependency is Python >= 3.7 (use python3 -V to check your version).
Usage
After cloning this repository, follow the steps below to migrate your documents. If you only want to export your visited documents and import them manually, follow only steps 2 and 4.
1. Export Your Owned Documents From Our CodiMD Instance and Weed Them Out
- Go to https://md.inf.tu-dresden.de/
- Click on your username in the upper right and hit "Export user data"
- Download the file "archive.zip" and place it in the same folder as this file (README.md)
- Weed out the archive! Most likely, it contains a lot of unused or empty documents. Delete those files in place inside the archive; do not extract them.
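To help spot candidates for deletion, the following sketch lists archive entries that are empty or contain only whitespace. It is not part of the tool; the helper name find_blank_entries is hypothetical, and the sketch only reads the archive, so removing the entries is still up to you.

```python
import os
import zipfile
from typing import List


def find_blank_entries(archive_path: str) -> List[str]:
    """Return names of archive members that are empty or whitespace only."""
    blank = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directories have no content to inspect
            if not archive.read(info).strip():
                blank.append(info.filename)
    return blank


if __name__ == "__main__":
    # Assumes the archive sits next to this file, as described above.
    if os.path.exists("archive.zip"):
        for name in find_blank_entries("archive.zip"):
            print(name)
```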
2. Extract the value of your CodiMD session cookie
To ensure that the script can access the documents you visited in our CodiMD instance, you need to extract the value of the connect.sid browser cookie for the particular CodiMD instance that you use (e.g., md.inf.tu-dresden.de).
This procedure is necessary because the CodiMD login is handled externally by Keycloak.
WARNING: the value of the cookie is your session ID and as such, should be treated like a password. Don't share it with others!
Now, the instructions are similar for Chrome and Firefox:
- Navigate your browser to the CodiMD instance. Make sure that you are logged in.
- Open developer tools (F12).
- Chrome: go to the "Application" tab. Firefox: go to the "Storage" tab ("Web-Speicher").
- Expand "Cookies".
- In the list, search for a cookie with the name connect.sid
- Select and copy the value. It must start with the character sequence s%3A.
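As a sanity check for the copied value, here is a minimal sketch that verifies the s%3A prefix (the URL-encoded form of "s:") and attaches the cookie to a request. The helper names are hypothetical, and this only illustrates how a session cookie can be sent with Python's standard library; it is not necessarily how the script does it.

```python
import urllib.request

PREFIX = "s%3A"  # URL-encoded "s:", expected at the start of the value


def looks_like_session_cookie(value: str) -> bool:
    """Check that the value has the expected prefix and is not just the prefix."""
    return value.startswith(PREFIX) and len(value) > len(PREFIX)


def build_request(url: str, cookie_value: str,
                  cookie_name: str = "connect.sid") -> urllib.request.Request:
    """Build (but do not send) a request carrying the session cookie."""
    if not looks_like_session_cookie(cookie_value):
        raise ValueError("cookie value must start with " + PREFIX)
    headers = {"Cookie": f"{cookie_name}={cookie_value}"}
    return urllib.request.Request(url, headers=headers)
```

Passing the result to urllib.request.urlopen would send the request with your session attached, so treat any code holding the value with the same care as the value itself.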
3. Extract the value of your HedgeDoc session cookie
Follow the instructions from step 2, but copy the value of the cookie named connect.hedgeDoc.sid.
4. Execute the script via the command line
Execute ./md-import-export.py or python3 md-import-export.py in a shell. The script will ask for your session IDs and download the notes you visited to the relative path ./codimd-documents.
Please note that your CodiMD history might reference already deleted notes or notes you no longer have access to. The URLs of these inaccessible notes are listed as part of the output of the script.
Furthermore, the script tries to upload every file with a .md extension to our HedgeDoc instance.
5. Visit the Uploaded Documents to Make Them Appear in Your HedgeDoc History
Uploading the documents is not enough to make them appear in your HedgeDoc history.
You need to visit them at least once to make them available.
The script automatically generates a file history_scripts/hedgedocu_documents_to_visit.url.
You can either visit every file manually or execute the bash script history_scripts/visit_migrated_documents.sh.
This script asks which browser to use and then opens the documents in batches of 50 URLs.
Be aware: this can be quite resource-intensive and may take a while.
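The batching idea can be sketched in Python as follows. This is an illustration only and not the bash script's implementation; it assumes that the generated file contains one URL per line, which the description above does not confirm, and the function names are hypothetical.

```python
import webbrowser
from typing import Iterator, List


def batches(urls: List[str], size: int = 50) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls` with at most `size` items each."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def visit_in_batches(url_file: str, size: int = 50) -> None:
    """Open every URL from `url_file` in the default browser, batch by batch."""
    # Assumption: one URL per line; blank lines are skipped.
    with open(url_file, encoding="utf-8") as handle:
        urls = [line.strip() for line in handle if line.strip()]
    for batch in batches(urls, size):
        for url in batch:
            webbrowser.open_new_tab(url)
        # Pause so the browser can finish loading before the next batch.
        input(f"Opened {len(batch)} documents; press Enter to continue...")
```

Pausing between batches keeps the number of simultaneously loading tabs bounded, which is the main reason to batch at all.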