Convert documents to markdown using Python and MarkItDown
Learn how to use Trigger.dev with Python to convert documents to markdown using MarkItDown.
This project uses Trigger.dev v4 (which is currently in beta as of 28 April 2025). If you want to run this project you will need to upgrade to v4.
Overview
Convert documents to markdown using Microsoft’s MarkItDown library. This can be especially useful for preparing documents in a structured format for AI applications.
Prerequisites
- A project with Trigger.dev initialized
- Python installed on your local machine. This example requires Python 3.10 or higher.
Features
- A Trigger.dev task that downloads a document from a URL and runs a Python script to convert it to Markdown
- A Python script to convert documents to markdown using Microsoft’s MarkItDown library
- Uses our Python build extension to install dependencies and run Python scripts
GitHub repo
The full code for this project is available in our examples repository on GitHub. You can fork it and use it as a starting point for your own project.
The code
Build configuration
After you’ve initialized your project with Trigger.dev, add the Python build extension to the build settings in your trigger.config.ts file.
To learn more about executing scripts in your Trigger.dev project, see the Python build extension documentation.
Task code
This task uses the python.runScript method to run the markdown-converter.py script with the given document URL as an argument.
Add a requirements.txt file
List the script's Python dependencies (here, the markitdown package) in your requirements.txt file. This file is required in Python projects so that the dependencies can be installed.
The Python script
The Python script uses MarkItDown to convert documents to Markdown format.
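The script itself is not reproduced here. As a rough illustration, a minimal sketch of such a script might look like the following, assuming it receives the document location (a local path or a URL) as its only command-line argument. The convert_document helper and its converter parameter are hypothetical names introduced for this sketch; MarkItDown().convert(...) and the text_content attribute of its result come from the MarkItDown library.

```python
import sys


def convert_document(source, converter=None):
    """Return the Markdown text for a document path or URL.

    `converter` is a hypothetical injection point used only in this sketch:
    any object with a .convert(source) method whose result exposes
    .text_content will work. By default a MarkItDown instance is used.
    """
    if converter is None:
        # Imported lazily so the helper can also be called with a stand-in
        # converter. Requires the markitdown package from requirements.txt.
        from markitdown import MarkItDown

        converter = MarkItDown()
    result = converter.convert(source)
    return result.text_content


if __name__ == "__main__":
    if len(sys.argv) == 2:
        # Print the Markdown to stdout so the calling task can capture it.
        print(convert_document(sys.argv[1]))
    else:
        print("usage: python markdown-converter.py <path-or-url>", file=sys.stderr)
```

Printing to stdout keeps the script easy to call from a task, which can then read the converted Markdown from the script's output. A production script would likely also want a non-zero exit code and error handling for failed conversions.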
Testing your task
- Create a virtual environment: python -m venv venv
- Activate the virtual environment, depending on your OS. On Mac/Linux: source venv/bin/activate. On Windows: venv\Scripts\activate
- Install the Python dependencies: pip install -r requirements.txt. Make sure you have Python 3.10 or higher installed.
- Copy the project ref from your Trigger.dev dashboard and add it to the trigger.config.ts file.
- Run the Trigger.dev CLI dev command (it may ask you to authorize the CLI if you haven’t already).
- Test the task in the dashboard by providing a valid document URL.
- Deploy the task to production using the Trigger.dev CLI deploy command.
MarkItDown Conversion Capabilities
- Convert various file formats to Markdown:
  - Office formats (Word, PowerPoint, Excel)
  - PDFs
  - Images (with optional LLM-generated descriptions)
  - HTML, CSV, JSON, XML
  - Audio files (with optional transcription)
  - ZIP archives
  - And more
- Preserve document structure (headings, lists, tables, etc.)
- Handle multiple input methods (file paths, URLs, base64 data)
- Optional Azure Document Intelligence integration for better PDF and image conversion
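Because the converted output keeps headings, it can be split into sections before being passed to an AI application. The following is a small sketch of one way to do that using only the standard library. The split_by_heading helper is hypothetical and not part of MarkItDown, and it does not special-case lines starting with "#" inside fenced code blocks.

```python
import re

# Matches ATX-style Markdown headings such as "# Title" or "### Details".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_heading(markdown):
    """Split Markdown text into (heading, body) pairs.

    Text that appears before the first heading is returned with a
    heading of None. Empty leading sections are skipped.
    """
    sections = []
    title, lines = None, []

    def flush():
        # Keep a section if it has a heading or any non-blank body text.
        if title is not None or any(line.strip() for line in lines):
            sections.append((title, "\n".join(lines).strip()))

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    flush()
    return sections
```

Each pair can then be stored or embedded separately, with the heading kept as context for its body text.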
Learn more about using Python with Trigger.dev
Python build extension
Learn how to use our built-in Python build extension to install dependencies and run your Python code.