Mega AI Agent Architecture and Workflow
The Mega AI Agent is a modular external service (e.g. a CLI or web API) that orchestrates PDF input parsing,
requirement extraction, and n8n JSON generation. Its core components include a PDF Ingestion & Parser,
a Natural Language Config Extractor, an n8n Workflow Builder, and auxiliary modules for media
processing, scheduling, and logging. Each module acts as a standalone tool or “sub-agent” – for example,
one component reads and extracts text from the PDF (using libraries like PyMuPDF or PDFPlumber 1 ),
another interprets that text to identify API endpoints, LLM choices, formats, and schedules, and a final
component assembles the nodes and connections of an n8n workflow in JSON form. The architecture
supports a human-in-the-loop, iterative refinement process, similar to the agent-with-loop diagrams above,
so the user can review and adjust the generated workflow and then re-run the agent. Because the design is
modular, the AI agent runs outside of n8n (as a CLI or service) and outputs a
workflow JSON that can be directly imported into an n8n instance 2 .
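The three core stages described above can be sketched as a small pipeline of injectable callables. This is only an illustration under assumptions: the Pipeline class, its field names, and the stand-in lambda stages are hypothetical and are not part of the plan or of n8n.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Chains the three core stages; each stage is a swappable callable."""
    parse_pdf: Callable[[str], str]          # PDF path -> raw text
    extract_config: Callable[[str], dict]    # raw text -> config dict
    build_workflow: Callable[[dict], dict]   # config dict -> n8n workflow dict

    def run(self, pdf_path: str) -> dict:
        text = self.parse_pdf(pdf_path)
        config = self.extract_config(text)
        return self.build_workflow(config)


# Stand-in stages for illustration only (no real PDF or LLM involved).
demo = Pipeline(
    parse_pdf=lambda path: "Instagram Channel: @DailyVerse",
    extract_config=lambda text: {"channel": text.split(": ", 1)[1]},
    build_workflow=lambda cfg: {
        "name": f"{cfg['channel']} posting",
        "nodes": [],
        "connections": {},
    },
)
workflow = demo.run("plan.pdf")
```

Keeping each stage behind a plain callable makes it straightforward to test a stage in isolation or to swap, say, a regex-based extractor for an LLM-based one.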
Key Modules
• PDF Parser (Input Layer): Ingests the PDF plan and extracts raw text. Use a Python PDF library (e.g.
PyMuPDF or PDFPlumber) to read pages and extract their text content 1 . Optionally apply OCR or
cleanup. Output is the raw instructions in text form.
• Requirement Extractor (Config Layer): Processes the extracted text (potentially via an LLM) to fill a
structured configuration object. This includes fields like “API endpoints” (e.g. verse API URL), “LLM
model” (e.g. Mistral-7B or GPT-4), “post frequency” (e.g. cron expression for 3x/day), “media
format” (image/video, carousel, audio flag), and “caption template”. One can prompt an LLM (OpenAI
or an open-source model) to identify and categorize these requirements. The output is a JSON config
(or Python dict) detailing endpoints, formatting rules, schedule, etc.
• n8n Workflow Builder: Translates the configuration into an n8n workflow JSON. Based on the
config, this module generates nodes and connections:
• Trigger Node: e.g. a “Schedule Trigger” set to run at specified intervals (n8n’s Schedule Trigger node
supports custom cron expressions 3 ).
• Data Fetch Nodes: HTTP Request nodes to call the specified APIs (e.g. fetching verses or captions).
• LLM Processing Node: An AI node (OpenAI, HuggingFace, etc.) to clean/format text. For example,
using the OpenAI node with the chosen model to process captions.
• Media Generation Node: A Function or Code node (Python or JS) that overlays text onto image
templates. For example, use Python’s Pillow library to draw text on an image 4 . If audio (verse
recitation) is needed, include an “Execute Command” node that runs an ffmpeg command to
combine the image and audio into a video 5 .
• Upload Node: A node to send media to a CDN/storage (like an S3 or Cloudinary HTTP request using
credentials).
• Instagram Publish Node: The n8n Facebook/Instagram Graph API node (“Set Instagram
Parameters” + “Publish” steps) to post the media. For example, a “Set Instagram Parameters” node
can be configured with image URL, caption, and Instagram Business Account ID 6 . Then an
“Instagram Publish” node posts it.
• Logging Nodes: Nodes to record actions or results. For instance, use the Google Sheets node to
append a log row, or the Notion node to create a log entry. (n8n has built-in Google Sheets and
Notion integrations 7 for these purposes.)
• Control Logic: Conditional or Split nodes to handle success/failure, and possible iterations (e.g., for
multi-image carousels).
All these nodes and their properties are assembled into a JSON structure. Note that n8n workflows are fully
described by JSON objects – this can be verified in the n8n docs 2 . The Mega Agent’s builder constructs
this JSON so it can be imported via n8n’s “Import from JSON” function or CLI.
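A minimal sketch of such a builder, assuming a linear chain of nodes, might look like the following. The helper names (make_node, chain, build_workflow) are hypothetical, and the node fields mirror only the abridged example later in this document; a real n8n export carries additional per-node fields, and the exact parameter layout of each node type should be checked against an exported workflow.

```python
import json


def make_node(name, node_type, parameters):
    """Build one node entry using the abridged fields shown in this document."""
    return {"name": name, "type": node_type, "parameters": parameters}


def chain(names):
    """Connect nodes linearly: each node's main output feeds the next node."""
    return {
        src: {"main": [[{"node": dst, "type": "main", "index": 0}]]}
        for src, dst in zip(names, names[1:])
    }


def build_workflow(config):
    """Translate a parsed config dict into a workflow dict (nodes + connections)."""
    nodes = [
        make_node("Schedule Trigger", "n8n-nodes-base.scheduleTrigger",
                  {"cronExpression": config["schedule_cron"]}),
        make_node("Fetch Verse", "n8n-nodes-base.httpRequest",
                  {"url": config["apis"]["verse"]}),
        make_node("Fetch Purport", "n8n-nodes-base.httpRequest",
                  {"url": config["apis"]["purport"]}),
    ]
    return {"nodes": nodes, "connections": chain([n["name"] for n in nodes])}


config = {
    "schedule_cron": "0 8,13,20 * * *",
    "apis": {
        "verse": "https://api.example.com/verse",
        "purport": "https://api.example.com/purport",
    },
}
workflow = build_workflow(config)
workflow_json = json.dumps(workflow, indent=2)  # text to write to workflow.json
```

Deriving the connections from the ordered node names keeps the two sections of the JSON from drifting apart when nodes are added or removed.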
• Media Processing (Support Layer): Implements the actual image and video creation. For text-on-
image, use Pillow (Python Imaging Library) to open a template image and draw text 4 . For adding
audio to an image to make a video, use ffmpeg via a subprocess. For example:
ffmpeg -loop 1 -y -i image.jpg -i sound.mp3 -shortest -pix_fmt yuv420p output.mp4
as shown in ffmpeg examples 5 . These media assets are then uploaded (e.g. via AWS S3 boto3 in
Python, or an n8n Amazon S3 node) for public access.
• Integration & Posting Logic: This includes configuring and using the Instagram Graph API through
n8n nodes. The agent ensures the workflow has nodes to handle Instagram auth and post upload.
For instance, an example n8n Instagram workflow uses a “Prepare Media” step (HTTP request to
Facebook API) and a “Publish Media” step once the upload is complete 8 . The agent will embed
these nodes with the right parameters from the PDF plan (e.g. business account ID, captions).
• Scheduling & Logging: Uses n8n’s Schedule Trigger (setting days, hours, or custom cron) to
implement the requested posting frequency 3 . For logging, the agent adds Google Sheets or
Notion nodes (n8n has these built-in) to record each run’s status 7 . For example, after posting, a
Google Sheets node could append a row with timestamp and post ID.
• Interactive Refinement Interface: The agent should support a conversational feedback loop. This
could be a CLI prompt or chatbot: after generating the workflow JSON, it asks the user to confirm or
adjust parameters. If there are errors, the user can say “Fix X,” and the agent re-runs the builder.
Technically, this might use an LLM (e.g. ChatGPT) to parse user feedback and modify the JSON
structure, then re-serialize it. This loop could be implemented with a simple REPL or an LLM-driven
interface. The UI is external to n8n, so the agent can easily re-generate the JSON as needed.
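The refinement loop in the last bullet can be sketched as follows. This is a toy version under stated assumptions: apply_feedback only understands the hypothetical phrasing "set <key> to <value>", where an LLM call would sit in a real agent, and the function names and the iterable of replies are illustrative choices.

```python
import copy
from typing import Callable, Iterable, Tuple


def apply_feedback(config: dict, feedback: str) -> dict:
    """Toy feedback parser for 'set <key> to <value>'; other input is ignored."""
    updated = copy.deepcopy(config)
    words = feedback.strip().split(maxsplit=3)
    if len(words) == 4 and words[0].lower() == "set" and words[2].lower() == "to":
        updated[words[1]] = words[3]
    return updated


def refine(config: dict,
           build: Callable[[dict], dict],
           replies: Iterable[str]) -> Tuple[dict, dict]:
    """Rebuild the workflow after each reply until the user answers 'yes'."""
    workflow = build(config)
    for reply in replies:
        if reply.strip().lower() == "yes":
            break
        config = apply_feedback(config, reply)
        workflow = build(config)
    return config, workflow


# Scripted replies stand in for interactive input here.
final_config, final_workflow = refine(
    {"llm_model": "mistral-7b"},
    build=lambda cfg: {"model": cfg["llm_model"]},
    replies=["set llm_model to gpt-4", "yes"],
)
```

In an interactive CLI, replies could be a generator that yields the result of input() after each printed summary; passing it in as an iterable keeps the loop testable without a terminal.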
Tools and Libraries
• PDF Parsing: Use PyMuPDF or PDFPlumber in Python. For example, with pdfplumber: pdf =
pdfplumber.open('plan.pdf'); text = pdf.pages[0].extract_text() 1 . These
libraries handle extracting text blocks and tables from PDFs.
• Language Models (LLMs): Use the OpenAI Python library for GPT-4/GPT-3.5, or HuggingFace
transformers for models like Mistral. The agent will call these via API to parse requirements or
format text.
• Image Processing: Use Pillow to overlay text on images 4 . Pillow’s ImageDraw and ImageFont
can render captions on background templates.
• Video/Audio (ffmpeg): Invoke ffmpeg (via subprocess) to merge images and audio. The agent’s environment
must provide the ffmpeg binary (or a wrapper library), e.g. by calling subprocess.run(...) with the
command from 5 .
• HTTP & API Calls: Use the requests library (or built-in n8n HTTP Request nodes) to test or call the
CDNs (S3/Cloudinary) and Instagram Graph API. AWS SDK ( boto3 ) can upload files to S3.
Cloudinary has a Python SDK if preferred.
• n8n JSON Handling: Use Python’s json module to build the workflow object. Reference n8n’s JSON
schema where possible; recall that “n8n saves workflows in JSON format” 2 .
• CLI / Interactive UI: Use a CLI framework like Typer or Click in Python to handle commands and
prompts. Optionally a minimal FastAPI app if an HTTP interface is desired.
• Testing: Use pytest or unittest to validate each module with sample inputs. For example, test the
PDF Parser on a known PDF, ensure the extracted config matches expectations.
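As one illustration of the ffmpeg item above, the command quoted earlier can be assembled as an argument list and passed to subprocess.run. The function names are hypothetical; the flags are exactly those of the command shown in this document, and ffmpeg is assumed to be on the PATH when make_video is called.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(image, audio, output):
    """Return the argument list for looping one image over one audio track."""
    return [
        "ffmpeg", "-loop", "1", "-y",
        "-i", str(image),
        "-i", str(audio),
        "-shortest", "-pix_fmt", "yuv420p",
        str(output),
    ]


def make_video(image, audio, output):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(image, audio, output), check=True)
    return Path(output)


command = build_ffmpeg_command("image.jpg", "sound.mp3", "output.mp4")
```

Passing a list rather than a shell string avoids quoting problems when file names contain spaces, and separating command construction from execution lets the command be unit-tested without ffmpeg installed.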
Testing and Deployment
• Unit/Integration Tests: Write tests for each module. For PDF parsing, supply a sample PDF and
assert correct text is extracted. For config extraction, use a fixed text prompt and compare against
expected JSON fields. For the workflow builder, feed a mock config and verify the JSON contains the
right nodes/connections. You can import the generated JSON into a test n8n instance (using n8n’s
CLI import command or via the UI) to ensure validity 2 .
• Continuous Integration: Use GitHub Actions or similar to run tests on push. Include a step that
calls n8n import:workflow on the JSON to catch import errors.
• Command-Line Tool: Package the agent as a Python application (with setup.py or
pyproject.toml ). Provide a CLI (e.g. mega-agent generate --input plan.pdf --output
workflow.json ) and log actions.
• Dockerization: For reproducibility, offer a Docker image bundling Python, ffmpeg, and any other
tools. This image can be used to run the CLI in any environment.
• Interactive Mode: In the CLI or service, implement a simple REPL: after generating the JSON, print a
summary and ask “Approve workflow? (yes/no)”. On “no”, allow text feedback and loop back. You can
even use an LLM to interpret the feedback (e.g. “Change API endpoint to ...”).
• n8n Integration Testing: Finally, test by importing the JSON into a real n8n setup and manually
checking that the workflow runs as intended (for example, simulate a day’s scheduled runs).
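Before handing the JSON to n8n in CI, a cheap structural check can catch dangling connections early. The sketch below is an assumption about what is worth checking (duplicate node names and connections that point at missing nodes); validate_workflow is a hypothetical helper and does not replace an actual import into n8n.

```python
from collections import Counter
from typing import List


def validate_workflow(workflow: dict) -> List[str]:
    """Return structural problems; an empty list means these checks passed."""
    errors = []
    names = [node.get("name") for node in workflow.get("nodes", [])]
    for name, count in Counter(names).items():
        if count > 1:
            errors.append(f"duplicate node name: {name}")
    known = set(names)
    for source, outputs in workflow.get("connections", {}).items():
        if source not in known:
            errors.append(f"connection from unknown node: {source}")
        for branches in outputs.values():      # e.g. the "main" output type
            for branch in branches:            # one list per output index
                for link in branch:
                    if link["node"] not in known:
                        errors.append(f"connection to unknown node: {link['node']}")
    return errors


good = {
    "nodes": [{"name": "Schedule Trigger"}, {"name": "Fetch Verse"}],
    "connections": {
        "Schedule Trigger": {"main": [[{"node": "Fetch Verse", "type": "main", "index": 0}]]}
    },
}
bad = {
    "nodes": [{"name": "Schedule Trigger"}],
    "connections": {
        "Schedule Trigger": {"main": [[{"node": "Fetch Verse", "type": "main", "index": 0}]]}
    },
}
```

A pytest case can then assert that validate_workflow returns an empty list for every workflow the builder emits.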
Example Input/Output Flow
Input PDF snippet:
Instagram Channel: @DailyVerse
Post type: Image carousel with audio (verse + purport).
API endpoints: verse API at https://api.example.com/verse, purport API at
https://api.example.com/purport.
LLM: Mistral-7B for text cleanup.
Schedule: 3 posts per day (8am, 1pm, 8pm).
Caption template: " {text} #DailyVerse".
Image template: verse_background.jpg.
Upload CDN: use S3 bucket my-bucket.
Logging: append log to Google Sheet.
Extracted config: (example JSON of parsed settings)
{
"channel": "@DailyVerse",
"apis": {
"verse": "https://api.example.com/verse",
"purport": "https://api.example.com/purport"
},
"llm_model": "mistral-7b",
"schedule_cron": "0 8,13,20 * * *",
"caption_template": " {text} #DailyVerse",
"image_template": "verse_background.jpg",
"media_type": "image_carousel",
"include_audio": true,
"cdn": { "type": "s3", "bucket": "my-bucket" },
"log_sheet": "GoogleSheetID"
}
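The step from "3 posts per day (8am, 1pm, 8pm)" to the schedule_cron value can be worked through in code. This is a sketch assuming the times are whole hours written like "8am" or "1pm"; to_24h and times_to_cron are hypothetical helper names, and an LLM-based extractor might produce the cron string directly instead.

```python
import re


def to_24h(label: str) -> int:
    """Convert a label such as '8am' or '1pm' to an hour in 0..23."""
    match = re.fullmatch(r"(\d{1,2})\s*(am|pm)", label.strip().lower())
    if not match:
        raise ValueError(f"unrecognised time: {label!r}")
    hour, suffix = int(match.group(1)), match.group(2)
    if not 1 <= hour <= 12:
        raise ValueError(f"hour out of range: {label!r}")
    # 12am maps to 0 and 12pm maps to 12; other pm hours gain 12.
    return hour % 12 + (12 if suffix == "pm" else 0)


def times_to_cron(labels):
    """Build a five-field cron expression that fires at minute 0 of each hour."""
    hours = sorted({to_24h(label) for label in labels})
    return "0 " + ",".join(str(h) for h in hours) + " * * *"


schedule_cron = times_to_cron(["8am", "1pm", "8pm"])
```

For the sample plan this yields the same expression as the extracted config: minute 0, hours 8, 13 and 20, every day.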
Generated n8n workflow JSON (abridged):
{
"nodes": [
{
"parameters": {
"cronExpression": "0 8,13,20 * * *"
},
"name": "Schedule Trigger",
"type": "n8n-nodes-base.scheduleTrigger"
},
{
"parameters": {
"url": "https://api.example.com/verse",
"responseFormat": "json"
},
"name": "Fetch Verse",
"type": "n8n-nodes-base.httpRequest"
},
{
"parameters": {
"url": "https://api.example.com/purport",
"responseFormat": "json"
},
"name": "Fetch Purport",
"type": "n8n-nodes-base.httpRequest"
},
{
"parameters": {
"model": "mistral-7b",
"prompt": "=Clean and format the text: {{$node[\"Fetch Verse\"].json[\"text\"]}}"
},
"name": "LLM Clean Text",
"type": "n8n-nodes-base.openAi"
},
{
"parameters": {
"functionCode": "from PIL import Image, ImageDraw, ImageFont\nimg = Image.open('verse_background.jpg')\ndraw = ImageDraw.Draw(img)\ndraw.text((10,10), item['fields']['caption'], font=ImageFont.truetype('arial.ttf', 24), fill='white')\nimg.save('out.png')"
},
"name": "Render Image",
"type": "n8n-nodes-base.python"
},
{
"parameters": {
"command": "ffmpeg -loop 1 -y -i out.png -i verse_audio.mp3 -shortest -pix_fmt yuv420p video.mp4"
},
"name": "Add Audio",
"type": "n8n-nodes-base.executeCommand"
},
{
"parameters": {
"functionCode": "import boto3\ns3 = boto3.client('s3')\ns3.upload_file('video.mp4', 'my-bucket', 'video.mp4')"
},
"name": "Upload to S3",
"type": "n8n-nodes-base.python"
},
{
"parameters": {
"url": "https://graph.facebook.com/v16.0/17841400008460056/media",
"bodyParametersUi": {
"parameter": [
{"name": "media_type", "value": "REELS"},
{"name": "video_url", "value": "https://s3.amazonaws.com/my-bucket/video.mp4"},
{"name": "caption", "value": "={{$node[\"LLM Clean Text\"].json[\"choices\"][0][\"text\"]}}"}
]
}
},
"name": "Prepare IG Media",
"type": "n8n-nodes-base.httpRequest"
},
{
"parameters": {
"url": "https://graph.facebook.com/v16.0/17841400008460056/media_publish",
"bodyParametersUi": {
"parameter": [
{"name": "creation_id", "value": "={{$node[\"Prepare IG Media\"].json[\"id\"]}}"}
]
}
},
"name": "Publish IG",
"type": "n8n-nodes-base.httpRequest"
},
{
"parameters": {
"operation": "append",
"sheetId": "GoogleSheetID",
"range": "Sheet1!A1:C1",
"fields": "A,B,C"
},
"name": "Log Result",
"type": "n8n-nodes-base.googleSheets"
}
],
"connections": {
"Schedule Trigger": {"main": [[{"node": "Fetch Verse", "type": "main",
"index": 0}]]},
"Fetch Verse": {"main": [[{"node": "Fetch Purport", "type": "main",
"index": 0}]]},
"Fetch Purport": {"main": [[{"node": "LLM Clean Text", "type": "main",
"index": 0}]]},
"LLM Clean Text": {"main": [[{"node": "Render Image", "type": "main",
"index": 0}]]},
"Render Image": {"main": [[{"node": "Add Audio", "type": "main", "index":
0}]]},
"Add Audio": {"main": [[{"node": "Upload to S3", "type": "main", "index":
0}]]},
"Upload to S3": {"main": [[{"node": "Prepare IG Media", "type": "main",
"index": 0}]]},
"Prepare IG Media": {"main": [[{"node": "Publish IG", "type": "main",
"index": 0}]]},
"Publish IG": {"main": [[{"node": "Log Result", "type": "main", "index":
0}]]}
}
}
This JSON illustrates the final n8n workflow. It includes a schedule trigger (set to run at 8am, 1pm, 8pm),
HTTP nodes to fetch data, an AI node to format text, a Python node to render images (using Pillow 4 ), an
ffmpeg command to add audio 5 , and Instagram API calls, finishing with a Google Sheets log. Importing
this JSON into n8n (via the UI or CLI) will create the configured automation 2 .
References: See n8n documentation on JSON workflows 2 , the Instagram posting example 6 , and tools
for image/text handling 4 5 and PDF parsing 1 . The agent can use n8n’s built-in Google Sheets and
Notion nodes for logging 7 and its Schedule Trigger supports cron expressions 3 . These components
together fulfill the requirements and allow iterative refinement of the workflow.
1 How To Easily Extract Text From Any PDF With Python | by Vinicius Porfirio Purgato | Analytics Vidhya | Medium
https://medium.com/analytics-vidhya/how-to-easily-extract-text-from-any-pdf-with-python-fc6efd1dedbe
2 Export and import workflows | n8n Docs
https://docs.n8n.io/workflows/export-import/
3 Schedule Trigger node documentation | n8n Docs
https://docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.scheduletrigger/
4 A Guide to Adding Text to Images with Python | Cloudinary
https://cloudinary.com/guides/image-effects/a-guide-to-adding-text-to-images-with-python
5 Combine one image + one audio file to make one video using FFmpeg - Super User
https://superuser.com/questions/1041816/combine-one-image-one-audio-file-to-make-one-video-using-ffmpeg
6, 8 Simple Social: Instagram Single Image Post with Facebook API | n8n workflow template
https://n8n.io/workflows/2537-simple-social-instagram-single-image-post-with-facebook-api/
7 Exporting and importing workflows | n8n Docs
https://docs.n8n.io/courses/level-one/chapter-6/