Introduction

The Aramid APIs are organized around REST. Our APIs have predictable resource-oriented URLs; accept form-encoded, raw, binary and JSON request bodies; return JSON-encoded, binary and plaintext responses; and use standard HTTP response codes, JWT authentication and extended HTTP verbs.

The APIs use Role-Based Access Control: the API key you use to authenticate a request determines which permissions are granted or denied for that request. You can manage your API keys and their endpoint permissions in your account.

Each API method has an associated price per call, which is deducted from your account credit balance on every invocation of that method. You can view your account credit balance and purchase usage credit in your account, directly from the dashboard or from the billing section.

REQUEST BASE URL

https://api.aramid.io

Authentication

The Aramid APIs use API keys to authenticate requests. An API key is comprised of two parts: the user and the secret - think of these as username/password combinations. You can manage your API keys in your account.

All APIs use a unified authentication and authorization mechanism based on JWT - you obtain JWT bearers via a common endpoint and you subsequently use them for all requests across all APIs.

Workflow

1. Issue a POST call to the /token endpoint with a JSON payload containing the API user/secret pair you wish to use. You will obtain a Bearer type token that expires after the number of seconds given by ttl, and a refresh token that expires after the number of seconds given by ttlRefresh.
2. Cache the token for ttl seconds and the refresh token for ttlRefresh seconds.
3. Start using our authenticated APIs: issue calls with an Authorization: Bearer [token] header, using the token from your cache (see examples below).
4. If your cached token expires, or if you use an expired token and receive an HTTP 401 UNAUTHORIZED error from any of the authenticated APIs, issue another POST call to the /token endpoint, this time with the refresh token from your cache as bearer (Authorization: Bearer [refresh_token]) and an empty payload. You will again obtain a Bearer type token and a refresh token. Re-cache them for subsequent requests, overwriting any previously stored token or refresh token, and retry the authenticated API call with the newly obtained token.
5. The refresh token is single use and expires. If your use case involves calls spaced further apart than the refresh window, or if your refresh token has expired and you receive an HTTP 401 UNAUTHORIZED error while trying to use it, restart the authentication workflow by POSTing your API key credentials as described in step 1.
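
The workflow above can be sketched as a small token cache. This is a minimal illustration, not an official client: TokenCache, the call transport and the clock parameter are hypothetical names, and the convention that the transport raises PermissionError on an HTTP 401 is an assumption of this sketch. The transport is injected so that the caching logic stays independent of any particular HTTP library.

```python
import time
from typing import Callable, Optional

# Hypothetical transport: (json_payload_or_None, bearer_or_None) -> parsed JSON body.
# A real one would POST to https://api.aramid.io/token and raise PermissionError
# on HTTP 401 (that mapping is an assumption of this sketch).
TokenCall = Callable[[Optional[dict], Optional[str]], dict]


class TokenCache:
    def __init__(self, user: str, secret: str, call: TokenCall, clock=time.monotonic):
        self._user, self._secret = user, secret
        self._call, self._clock = call, clock
        self._token = self._refresh = None
        self._token_exp = self._refresh_exp = 0.0

    def _store(self, body: dict) -> str:
        # Step 2: cache both tokens together with their expiry times.
        now = self._clock()
        self._token, self._refresh = body["token"], body["refresh"]
        self._token_exp = now + body["ttl"]
        self._refresh_exp = now + body["ttlRefresh"]
        return self._token

    def _login(self) -> str:
        # Step 1: exchange the API user/secret for a token pair.
        return self._store(self._call({"user": self._user, "secret": self._secret}, None))

    def get(self) -> str:
        now = self._clock()
        if self._token and now < self._token_exp:
            return self._token  # step 3: cached token is still valid
        if self._refresh and now < self._refresh_exp:
            refresh, self._refresh = self._refresh, None  # refresh tokens are single use
            try:
                return self._store(self._call(None, refresh))  # step 4
            except PermissionError:
                pass  # step 5: refresh rejected, fall back to a full login
        return self._login()

    def invalidate(self) -> None:
        # Call after an unexpected 401 so that the next get() refreshes.
        self._token_exp = 0.0
```

Calling get() before each authenticated request returns the cached token while it is valid and transparently refreshes or re-authenticates otherwise.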

API keys have access permissions at method and endpoint granularity. You can grant or revoke individual permissions for each key when you create it, or modify them later. Since keys get embedded in your production code, we recommend creating multiple keys with the minimal required permissions, scoped to different application areas. For example, you could have one set of keys with more relaxed permissions (even all permissions) for your backend, and another set with just the specific permissions your frontend needs. Because frontend code is in most cases distributed to clients, it is good security practice to limit its access to resources as much as possible.

If you need to rotate keys based on your security policy or to combat misuse, you can achieve this by creating a new key, replacing it in your application and deleting the previously used key in your account.

POST /token PRICE: $0.00

curl --location --request POST 'https://api.aramid.io/token' \
--header 'Content-Type: application/json' \
--data-raw '{
    "user":"[API_USER]",
    "secret":"[API_USER_SECRET]"
}'

RESPONSE

{
    "type": "Bearer",
    "token": "eyJ0e[...JWT_TOKEN]",
    "ttl": 900,
    "refresh": "hjB3i[...REFRESH_TOKEN]",
    "ttlRefresh": 1815300
}

REQUEST

curl --location --request GET 'https://api.aramid.io/example' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer eyJ0e[...JWT_TOKEN]' \
--data '{
    "foo":"bar"
}'

POST /token PRICE: $0.00

curl --location --request POST 'https://api.aramid.io/token' \
--header 'Authorization: Bearer hjB3i[...REFRESH_TOKEN]'

RESPONSE

{
    "type": "Bearer",
    "token": "eyJ0e[...NEW_JWT_TOKEN]",
    "ttl": 900,
    "refresh": "hjB3i[...NEW_REFRESH_TOKEN]",
    "ttlRefresh": 1815300
}

OAuth 2.0 Compatibility

For backward compatibility with OAuth 2.0 authorization, the Aramid APIs support alternative names for the /token input parameters, which are enabled by specifying the OAuth 2.0 grant type.

You don't need to do anything special to enable this compatibility layer: simply send a grant_type=client_credentials parameter, along with the client_id and client_secret as per OAuth 2.0 guidance.

In doing so, the /token endpoint will return its contents under the expected type, access_token and expires_in fields, enabling full interoperability with OAuth 2.0 capable clients and libraries; the token refresh capability is maintained via the fallback re-issuance mechanism.
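
If your code has to handle both response shapes, one option is to normalize them into a single internal structure. The sketch below assumes only the field names shown in the example responses in this section; normalize_token_response and its output keys are hypothetical names chosen for illustration.

```python
def normalize_token_response(body: dict) -> dict:
    """Map either /token response shape onto one internal shape."""
    if "access_token" in body:  # OAuth 2.0 compatible field names
        token, ttl = body["access_token"], body["expires_in"]
    else:  # native field names
        token, ttl = body["token"], body["ttl"]
    return {
        "type": body.get("type", "Bearer"),
        "token": token,
        "ttl": int(ttl),
        # The OAuth-style example response shows no refresh token, so these may be None.
        "refresh": body.get("refresh"),
        "ttl_refresh": body.get("ttlRefresh"),
    }
```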

This ability to interop with OAuth 2.0 clients permits you, amongst other things, to make use of Authorization token management features in API clients such as Postman, Insomnia, Apidog or Bruno (recommended).

To enable automatic token management in such an API Development Client, use the following settings:

Authorization type: OAuth 2.0
Add authorization data to: Request Headers
Header Prefix: Bearer
Grant type: Client Credentials
Access Token URL: https://api.aramid.io/token
Client ID: [API_USER]
Client Secret: [API_USER_SECRET]
Client Authentication: Send client credentials in body
Refresh Token URL: https://api.aramid.io/token

POST /token PRICE: $0.00

curl --location --request POST 'https://api.aramid.io/token' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'grant_type=client_credentials' \
--data-urlencode 'client_id=[API_USER]' \
--data-urlencode 'client_secret=[API_USER_SECRET]'

RESPONSE

{
    "type": "Bearer",
    "access_token": "eyJ0e[...JWT_TOKEN]",
    "expires_in": 900
}

Errors

Aramid uses conventional HTTP response codes to indicate the success or failure of an API request. In general: Codes in the 2xx range indicate success. Codes in the 4xx range indicate a request that failed given the information provided (e.g., a required parameter was omitted, authentication failed, etc.). Codes in the 5xx range indicate an error with Aramid's servers (these are rare).

Some 4xx errors that could be handled programmatically (e.g., authorization failed) include an error code that briefly explains the error reported.

The fingerprint provided in both the response headers and in the error response bodies can be used for tracing calls in logs.

ERROR RESPONSE EXAMPLE

{
    "code": 401,
    "message": "Unauthorized",
    "fingerprint": "1887500330234359668036534417416783032005"
}

HTTP STATUS CODE SUMMARY

200	OK	Everything worked as expected.
400	Bad Request	The request was unacceptable, often due to missing a required parameter.
401	Unauthorized	No valid API key or Authorization header provided.
402	Payment Required	The request has a price associated to it and your credit balance cannot satisfy it. Buy credits.
403	Forbidden	The API key doesn't have permissions to perform the request.
404	Not Found	The requested resource doesn't exist.
409	Conflict	The request conflicts with another request.
429	Too Many Requests	Too many requests hit the API too quickly. We recommend an exponential backoff of your requests.
500, 502, 503, 504	Server Errors	Something went wrong on Aramid's end. (These are rare.)
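
The status codes above can be turned into a simple client-side decision, together with the exponential backoff recommended for 429 responses. This is a sketch under assumptions: the function names, the base and cap values, the use of random jitter, and the choice to also retry 5xx responses are illustrative and not prescribed by the API.

```python
import random

# Assumption: 5xx responses are treated as retryable alongside 429.
RETRYABLE = {429, 500, 502, 503, 504}


def next_action(status: int) -> str:
    """Translate an HTTP status code into what the client should do next."""
    if 200 <= status < 300:
        return "ok"
    if status == 401:
        return "reauthenticate"  # refresh the token or log in again
    if status == 402:
        return "buy_credits"
    if status in RETRYABLE:
        return "retry"
    return "fail"  # 400, 403, 404, 409: retrying unchanged will not help


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0, rng=random.random) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    The upper bound doubles with every attempt, is capped at `cap`,
    and the actual delay is drawn at random below it (full jitter).
    """
    return rng() * min(cap, base * (2 ** attempt))
```

A caller would sleep for backoff_delay(attempt) whenever next_action returns "retry", and give up after a fixed number of attempts.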

Parse2JSON

🚀 Turn Any File Into Structured JSON - Automatically

Meet Parse2JSON - The smart API that transforms raw content into clean, usable data.

Documents. PDFs. Images. Web pages. Emails. Plain text. You name it - just send it to Parse2JSON and get back perfectly structured JSON.

Whether you're building automation workflows, feeding AI models, or just tired of regex nightmares, Parse2JSON is the no-fuss, plug-and-play solution to extract meaning from messy content.

🧠 Built-In AI. Flexible Control.

Not every project needs the same level of direction. That's why Parse2JSON offers three powerful modes of parsing:

🔍 Auto Mode - Let the AI do its thing. It reads the content, figures out what's important, and gives you structured output - all on its own.

🗝️ Keys Mode - Want specific fields? Just tell us the keys you care about. Parse2JSON will hunt them down and fill them in.

🧩 Schema Mode - Need full control? Send a JSON schema, and we'll fill it out. You'll either get a perfectly matching object - or a helpful error if something doesn't fit.

📦 One Simple Endpoint

Send a POST request to a single HTTPS endpoint. That's it.

✅ Supports Content-Type: application/json (recommended)
✅ Or use form-data if you're feeling old-school
✅ Just two fields: your content, and how you want it parsed
✅ Always returns clean, valid JSON

📄 Input Anything

📃 Text

🖼️ Images (OCR ready)

📄 PDFs

🔗 URLs

🧠 Even base64 blobs

If it has content, we can parse it.

🔐 Smart, Honest Output

Parse2JSON doesn't make stuff up. In schema mode, you get strict adherence to your schema - or an error if it can't comply. For all other modes, you get the best-effort structured JSON based on your hints or content.

🎯 Use Cases

Here's where Parse2JSON really flexes its versatility:

📄 Invoice Parsing

Automatically extract invoice numbers, dates, line items, total amounts, and payment terms from invoices - scanned, photographed, or digital.

👩‍💼 Resume Screening

Pull structured data like candidate names, contact info, work history, skills, and certifications from messy resumes - ideal for HR automation and ATS systems.

📬 Email Triage

Convert email bodies into structured summaries: who wrote it, what it's about, deadlines, ticket numbers, and action items - perfect for customer support, sales, or project inboxes.

📚 Document Summarization & Indexing

Take long documents like research papers, reports, legal memos, or internal docs and convert them into structured summaries, tables of contents, and metadata for search and reference.

📜 Contract Data Extraction

Extract key contract clauses, party names, start/end dates, renewal terms, and responsibilities - with high accuracy across varying legal templates.

🧠 Knowledge Ingestion for AI

Convert large sets of internal documents, manuals, or whitepapers into JSON format to feed vector databases, fine-tuning pipelines, or RAG systems for LLMs.

📝 Form Processing

Parse scanned forms (tax returns, insurance claims, feedback surveys) and extract field-level data into structured JSON - even with checkboxes and handwriting.

📊 Report Digitization

Extract tables, charts, KPIs, and narrative summaries from PDF reports like bank statements, investment summaries, or performance reviews.

⚖️ Regulatory Compliance & Monitoring

Extract relevant regulatory language, clauses, or flagged phrases from submitted documents to support legal review, KYC, or audit workflows.

🔍 Identity Document Parsing

Read and extract information from passports, driver's licenses, national IDs - perfect for onboarding, age verification, and identity verification flows.

💬 Chat Transcript Analysis

Convert unstructured chat logs into structured conversational summaries with participants, topics, sentiment analysis, and key action points.

🧾 Receipt Parsing

Pull merchant names, totals, item lists, and payment methods from uploaded receipts for expense tracking or accounting tools.

📢 Press Release & News Extraction

Turn PR documents or scraped news articles into structured entities: company names, event dates, quotes, and story highlights - great for media monitoring and trend analysis.

🗂️ Bulk Data Extraction from Archives

Parse large archives of scanned files, PDFs, or old docs into usable structured formats - perfect for digitizing legacy systems or historical records.

🧪 Scientific Paper Analysis

Extract titles, authors, abstracts, methods, and findings from academic PDFs to support meta-analysis, discovery tools, or AI model training.

📦 Product Info Extraction

Scrape product listings, brochures, or spec sheets to pull out structured product data: names, SKUs, specs, pricing, and descriptions.

🏢 Business Card Parsing

Extract contact information from images or scans of business cards: name, title, phone, email, company - for CRM automation.

🌐 Website Content Structuring

Convert webpage text into structured content blocks: titles, headers, images, lists - ideal for content republishing, migration, or SEO analytics.

📞 Call Summary Extraction

Parse call transcripts or meeting notes into structured summaries: participants, topics, decisions, follow-ups - great for CRM logging or team handovers.

⚡ Parse2JSON is Your API for All of It

Whether you're working with messy user uploads, legacy documents, or structured business content, Parse2JSON makes it easy to convert any format into clean, reliable JSON - ready to power your apps, workflows, or AI pipelines.

One endpoint. Endless formats. Structured results.

Introduction

The Parse2JSON API is a single HTTPS endpoint that accepts requests via the POST method. Inputs can be sent either as form-data fields or, preferably, as JSON with the Content-Type: application/json header. Each request includes up to two primary fields:

the content to be parsed,
and a mode specifier that defines how the content should be analyzed and structured into JSON output.

The API always returns a valid JSON object. Depending on the chosen mode:

In most modes, the output loosely follows the requested structure, making a best-effort attempt to fill in the expected fields.
In schema mode, the output strictly conforms to the provided JSON schema; if strict conformance isn't possible, the API returns an error rather than an invalid or incomplete result.

The API is backed by AI, using a dedicated text comprehension LLM.

Pricing is per call: a base processing cost plus a variable charge for AI input and output tokens. As a rule of thumb, one token corresponds to roughly 4 characters of natural-language English text.
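
As a worked example of this pricing, the sketch below applies the price listed with the endpoint examples in this document ($0.025 + TC, where TC = $0.000001 / inTok + $0.000004 / outTok). The function and constant names are hypothetical, and the actual token counts of a call depend on the backing model, so treat the result as an estimate.

```python
from decimal import Decimal

BASE_PRICE = Decimal("0.025")            # per-call price listed for POST /parse2json
PRICE_PER_IN_TOKEN = Decimal("0.000001")
PRICE_PER_OUT_TOKEN = Decimal("0.000004")


def estimate_call_cost(in_tokens: int, out_tokens: int) -> Decimal:
    """Estimated price in USD of one call: base price plus token cost (TC)."""
    return BASE_PRICE + in_tokens * PRICE_PER_IN_TOKEN + out_tokens * PRICE_PER_OUT_TOKEN
```

For instance, a call that consumes 1000 input tokens and produces 500 output tokens comes to 0.025 + 0.001 + 0.002 = $0.028.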

Input

content FileHandle | DataUri | UrlString | RawString | RawArray
Mandatory. The input data to perform processing on. Can be a form-data file upload, OR a base64 encoded data uri, OR a valid url where content is to be downloaded from, OR a raw string OR a valid JSON object/array (to derive schema from)

keys SafeString | SafeStringArray | null
Optional. Can be either a list of comma separated values consisting of key names, OR a JSON array of key names OR a JSON object of key names and type hint values to loosely structure the output against.

schema RawArray | null
Optional. A valid JSON array/object schema to strictly structure the output against.

prompt RawString | null
Optional. A raw string text input consisting of natural language instructions on how the parser should extract and format content.

template RawString | null
Optional. A raw string text input consisting of a simple repeatable text pattern containing variables to be extracted in double curly braces, e.g.: {{variable}}

suggest SafeString | null
Optional. One of ['keys','schema','template'] - If provided, analyzes input and suggests possible values to be used in one of the other modes of parsing; does not return parsed content.

example RawArray | null
Optional. A valid JSON object/array (to derive schema from). If provided, first derives a JSON schema and then uses it as strict schema to format the parsed input against.
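
For the template parameter described above, it can help to list the variables a template declares before sending it. The helper below is a local convenience sketch, not part of the API: template_variables is a hypothetical name, and restricting variable names to identifier-like strings is an assumption.

```python
import re

# Matches {{name}} and tolerates spaces inside the braces.
_VAR = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def template_variables(template: str) -> list[str]:
    """Return the variable names in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for name in _VAR.findall(template):
        seen.setdefault(name, None)
    return list(seen)
```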

POST /parse2json PRICE: $0.025 + TC

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Hi Jane,\n\nWe'\''re happy to let you know that your train ticket has been successfully issued. Your journey from Zone 1 to Zone 4 is scheduled to begin on the 1st of March 2025. The ticket cost was £59, and it will remain valid for one full year from the start date, giving you plenty of flexibility to plan your travel.\n\nIf you need any help or have questions about your trip, feel free to get in touch. We'\''re here to make your journey as smooth and enjoyable as possible.\n\nBest regards,\n[Your Company Name]"
}'

TC = $0.000001 / inTok + $0.000004 / outTok

RESPONSE

{
    "recipient": "Jane",
    "message": {
        "status": "ticket issued",
        "journey": {
            "from": "Zone 1",
            "to": "Zone 4",
            "departure_date": "2025-03-01"
        },
        "ticket": {
            "cost": "£59",
            "validity_period": "1 year",
            "start_date": "2025-03-01"
        },
        "additional_info": "The ticket remains valid for one full year from the start date, providing flexibility for planning travel."
    },
    "company": "[Your Company Name]",
    "contact_instructions": "If you need help or have questions, feel free to get in touch."
}

Parsable Content

The Parse2JSON API has one mandatory parameter, specified by the content key, which is the input data processing is to be performed upon.

This content can take one of three forms: raw text, image or PDF files, and a third special case - a JSON array or object - accepted only when deriving a schema from an already existing object structure.

For ease of use during development, the API can accept form-data fields (containing the content and other keys), but for production use we recommend sending the input as a JSON object containing the content and any other necessary keys, due to the increased flexibility this mode offers.

In the case where the input is a JSON and the content key specifies a file, this can be provided in one of two ways:

a base64 encoded data uri of the binary content of the file, e.g.: data:image/png;base64, •••••• or data:image/jpg;base64, •••••• or data:application/pdf;base64, ••••••
an http(s) URL to a publicly accessible file, e.g.: http://example.com/path/to/image.png, or https://example.com/path/to/document.pdf
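
A base64 data URI of the kind listed above can be built with the Python standard library. This is a minimal sketch: to_data_uri is a hypothetical helper name, and inferring the MIME type from the file name is an assumption made for brevity.

```python
import base64
import mimetypes


def to_data_uri(data: bytes, filename: str) -> str:
    """Encode raw file bytes as a data URI, inferring the MIME type from the file name."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        raise ValueError(f"cannot infer a MIME type from {filename!r}")
    return f"data:{mime};base64,{base64.b64encode(data).decode('ascii')}"
```

The returned string can be placed directly under the content key of the JSON request body.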

POST /parse2json PRICE: $0.025 + TC

//you can send in form-data with a file from your machine

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Authorization: Bearer ••••••' \
-F 'content=@"/C:/Users/local/Downloads/photo.png"'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//or you can send in base64 encoded images (jpeg, png, bmp, gif, etc - any image works)

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "data:image/png;base64,iVBORw0K ••••••"
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//pdfs are accepted too!

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "data:application/pdf;base64,JVBERi0xLj ••••••"
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//or you can specify a url where your content is located at

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "http://example.com/path/to/image.png"
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//the simplest type of input is raw text

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Lorem Ipsum ••••••"
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//if you give in a valid JSON, it will return a valid and strict JSON schema to be used in subsequent calls

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": {
        "key0": "val0",
        "key1": "val1",
        "key2": {
            "subKey0": "subVal0",
            "subKey1": "subVal1"
        }
    }
}'

TC = $0.000001 / inTok + $0.000004 / outTok

Parsing Modes

When providing text or file input to the Parse2JSON API, you can further specify one of several parsing modes to better control the output. Each mode offers increasingly more control and stronger guarantees over the structures emitted, striking a different balance between ease of implementation and result strictness.

These modes are:

auto: inferred by not specifying any of the optional mode keys; the backing AI LLM will analyze and zero-shot extract all structured information available in the input, organizing the output in a clear, simple and concise manner; most useful for one-off tasks where output structure rigor is not crucial.
prompt: allows natural language input describing what information is to be extracted and how it should be structured in the output; most useful for simple, predictable input that requires low complexity structures.
template: for simple, consistent and repeatable content, this mode offers a direct replacement for approaches such as regex or DOM parsing, all while benefitting from the language model's text comprehension ability; data to be extracted should be marked with double curly braces, e.g.: {{example}}; most useful for simple string manipulation.
keys: the simplest direct mode of guiding output structure; keys can be a simple list of comma separated values (or equivalently, a JSON array of string keys) or, to ensure further consistency, a JSON object consisting of string key names and type hint values; most useful for general data extraction tasks, striking the best balance between ease of implementation and result structure.
schema: the ultimate form of guidance and control over output structure, ensuring output adherence to the specified format; most useful for complex content parsing, allowing for repeating and nested structures.
example: allows for bottom-up integration, inferring a schema from an existing data structure while offering all the benefits of schema mode; most useful when you already have a JSON output object generated from content you wish to parse.

A special mode available for development and integration purposes is the suggest mode. Instead of returning actual parsed content from your input, it returns suggested input parameters to be used in subsequent calls for the other parsing modes, helping developers elaborate those parameters. It accepts one of three sub-parameters (presented as values for the suggest key):

keys: instead of returning actual parsed content from your input, analyzes the input and suggests keys that can be used in the keys extraction mode.
template: assumes the input is one sample of multiple similar items, and offers a possible templating string that can be used in template extraction mode.
schema: the most useful and important suggestion mode - analyzes input content and elaborates a valid and strict JSON schema that can be further used in schema extraction mode - since schema authoring can be quite a lengthy process, depending on what input is to be parsed, this mode offers the greatest time savings for development, allowing you to edit the suggested schema to fit your needs instead of starting from scratch with a blank one.

Note: when providing JSON content input for purposes of schema generation from the provided JSON array or object, no other modes are available - internally, a "suggest":"schema" is inferred.
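
The mode rules above can be enforced on the client before a request is sent. The sketch below is an illustration under assumptions: build_payload is a hypothetical helper, and treating the mode keys as mutually exclusive (at most one per request) is a reading of the text above rather than a documented server-side rule.

```python
import json

MODE_KEYS = ("keys", "schema", "prompt", "template", "suggest", "example")
SUGGEST_VALUES = ("keys", "schema", "template")


def build_payload(content, **mode) -> str:
    """Serialize a request body with the content and at most one mode key."""
    unknown = set(mode) - set(MODE_KEYS)
    if unknown:
        raise ValueError(f"unknown mode key(s): {sorted(unknown)}")
    if len(mode) > 1:
        raise ValueError("specify at most one mode key per request")
    if "suggest" in mode and mode["suggest"] not in SUGGEST_VALUES:
        raise ValueError(f"suggest must be one of {SUGGEST_VALUES}")
    if isinstance(content, (dict, list)) and mode:
        # JSON content is used for schema generation only; no other mode applies.
        raise ValueError("JSON content cannot be combined with a mode key")
    return json.dumps({"content": content, **mode}, ensure_ascii=False)
```

Omitting every mode key yields an auto mode request, for example build_payload("Lorem Ipsum").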

POST /parse2json PRICE: $0.025 + TC

//you can skip any guidance and let the AI find the structure
curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Lorem Ipsum ••••••"
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//or you can provide keys to extract as a list of comma separated values
curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Lorem Ipsum ••••••",
    "keys":"key0,key1,key2"
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//or you can specify the keys as a JSON array
curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Lorem Ipsum ••••••",
    "keys":["key0","key1","key2"]
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//or you can specify the keys as a JSON object with type hints for values
curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Lorem Ipsum ••••••",
    "keys":{
        "key0":"string",
        "key1":"currency",
        "key2":"Y-m-d H:i:s",
        "key3":"date"
    }
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//you can direct the AI via a natural language prompt
curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Lorem Ipsum ••••••",
    "prompt":"Extract key details from a travel notification email, including recipient name, ticket details (origin, destination, issue date, validity period, total cost), and any contact or support information."
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//you can specify a template to be used for extracting data out of simple, consistent and repeatable content
curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Lorem Ipsum ••••••",
    "template":"Hello {{recipient_name}},\n\nWe’re pleased to inform you that your {{ticket_type}} from {{origin}} to {{destination}} has been issued."
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//the most powerful option - you can specify a JSON schema the output will be structured against
curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Lorem Ipsum ••••••",
    "schema": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": { •••••• },
        "required": [ •••••• ],
        "additionalProperties": false
    }
}'

TC = $0.000001 / inTok + $0.000004 / outTok

POST /parse2json PRICE: $0.025 + TC

//when you need help creating the keys, template or schema to be used, use the suggest mode; "suggest" accepts "schema", "keys" or "template"
curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Lorem Ipsum ••••••",
    "suggest":"schema"
}'

TC = $0.000001 / inTok + $0.000004 / outTok

The Simplest Case

The simplest way to use the Parse2JSON API for obtaining output that conforms to a given structure is to provide the content you wish to be parsed and a list of keys to be returned in the output JSON. This is best suited to non-deterministic content (in most cases, content written in natural language) that has no recursive elements in it. Examples include email or other communication text, text scraped from websites, documentation, research papers, news articles, technical instructions, product labels, and marketing or promotional text.

To get a quick start using file inputs, you can send in data as form-data fields, with the file under the content input key, and the keys for extraction as a list of comma separated values in the keys input.

The immediate next level of control comes by providing the input as a JSON, with both content and keys as before, this time having the ability to provide type hints for key values - this approach provides the best balance between ease of use and the rigor provided by a full schema.

POST /parse2json PRICE: $0.025 + TC

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Hi Jane,\n\nWe'\''re happy to let you know that your train ticket has been successfully issued. Your journey from Zone 1 to Zone 4 is scheduled to begin on the 1st of March 2025. The ticket cost was £59, and it will remain valid for one full year from the start date, giving you plenty of flexibility to plan your travel.\n\nIf you need any help or have questions about your trip, feel free to get in touch. We'\''re here to make your journey as smooth and enjoyable as possible.\n\nBest regards,\n[Your Company Name]",
    "keys":"customer, route, price, startDate, validityPeriod"
}'

TC = $0.000001 / inTok + $0.000004 / outTok

RESPONSE

{
    "customer": "Jane",
    "route": "Zone 1 to Zone 4",
    "price": "£59",
    "startDate": "2025-03-01",
    "validityPeriod": "1 year"
}
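
Because keys mode is best-effort rather than strict, you may want to verify on the client that every requested key came back. The helpers below are a hypothetical sketch; they accept the keys in any of the three supported forms (comma separated string, JSON array, or JSON object with type hints).

```python
def parse_key_list(keys) -> list[str]:
    """Normalize a keys value into a plain list of key names."""
    if isinstance(keys, str):
        return [k.strip() for k in keys.split(",") if k.strip()]
    return list(keys)  # a list yields its items, a dict yields its key names


def missing_keys(requested, response: dict) -> list[str]:
    """Return the requested key names that are absent from the parsed response."""
    return [k for k in parse_key_list(requested) if k not in response]
```

For the request and response shown above, missing_keys returns an empty list.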

Generating A Schema

The quickest way to generate a JSON schema to structure parsed output against is to first use the Parse2JSON API in "suggest":"schema" mode: provide a sample item of the content you will parse, and the "suggest":"schema" key-value pair in the endpoint input JSON. This will analyze your input and return a suggested valid and strict JSON schema that conforms to what information the AI deduces can be extracted from your input.

You can then proceed to use the schema as given or further refine it according to what information keys, data types and structures you wish to extract. Recursive and multi level data structures are supported.

Since this schema will be used as input for the AI in parsing mode, you can provide natural language hints in the "description" key of extracted properties, alongside data types. Note that all properties to be extracted must be specified as required in their specific nesting block, and the additionalProperties flag must be set to false. This ensures that the AI follows the schema strictly and returns exactly the keys you expect (no more, no less). It is possible to emulate an optional parameter by using a union type with null, e.g.: "temperatureUnitProperty": { "type": ["string", "null"], "description": "The unit to return the temperature in", "enum": ["F", "C"] }
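
The two strictness rules above (every property listed as required, additionalProperties set to false) can be checked locally before a schema is sent. This is a minimal sketch: strictness_problems is a hypothetical helper that only inspects "properties" and "items", and it does not perform full JSON Schema validation.

```python
def strictness_problems(schema: dict, path: str = "$") -> list[str]:
    """List the places where a schema breaks the strictness rules."""
    problems = []
    types = schema.get("type")
    types = types if isinstance(types, list) else [types]
    if "object" in types:
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        for name, sub in props.items():
            problems += strictness_problems(sub, f"{path}.{name}")
    if "array" in types and isinstance(schema.get("items"), dict):
        problems += strictness_problems(schema["items"], f"{path}[]")
    return problems
```

An empty list means that every nesting block in the schema satisfies both rules.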

For more information on JSON schema, consult the official spec at https://json-schema.org/learn/getting-started-step-by-step

POST /parse2json PRICE: $0.025 + TC

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Hi Jane,\n\nWe'\''re happy to let you know that your train ticket has been successfully issued. Your journey from Zone 1 to Zone 4 is scheduled to begin on the 1st of March 2025. The ticket cost was £59, and it will remain valid for one full year from the start date, giving you plenty of flexibility to plan your travel.\n\nIf you need any help or have questions about your trip, feel free to get in touch. We'\''re here to make your journey as smooth and enjoyable as possible.\n\nBest regards,\n[Your Company Name]",
    "suggest":"schema"
}'

TC = $0.000001 / inTok + $0.000004 / outTok

RESPONSE

{
    "type": "object",
    "properties": {
        "recipientName": {
            "type": "string",
            "description": "The name of the recipient, e.g., Jane"
        },
        "ticketStatus": {
            "type": "string",
            "description": "Status of the ticket, e.g., successfully issued"
        },
        "journey": {
            "type": "object",
            "properties": {
                "from": {
                    "type": "string",
                    "description": "Starting zone of the journey"
                },
                "to": {
                    "type": "string",
                    "description": "Destination zone of the journey"
                }
            },
            "additionalProperties": false,
            "required": [
                "from",
                "to"
            ]
        },
        "startDate": {
            "type": "string",
            "description": "Departure date in ISO 8601 format (e.g., 2025-03-01)"
        },
        "ticketCost": {
            "type": "string",
            "description": "Cost of the ticket, including currency symbol"
        },
        "validityPeriod": {
            "type": "object",
            "properties": {
                "duration": {
                    "type": "string",
                    "description": "Duration for which the ticket remains valid (e.g., 'one year')"
                },
                "startDate": {
                    "type": "string",
                    "description": "Date from which the ticket validity starts in ISO 8601 format"
                }
            },
            "additionalProperties": false,
            "required": [
                "duration",
                "startDate"
            ]
        }
    },
    "additionalProperties": false,
    "required": [
        "recipientName",
        "ticketStatus",
        "journey",
        "startDate",
        "ticketCost",
        "validityPeriod"
    ]
}
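
The PRICE line for this endpoint is $0.025 + TC, with TC = $0.000001 / inTok + $0.000004 / outTok. The sketch below reads this as a flat fee per call plus a per-token charge for input and output tokens; that reading, the function name estimate_call_cost and the token counts used are assumptions. This section does not describe how token counts are reported.

```python
from decimal import Decimal

BASE_PRICE = Decimal("0.025")             # flat price per call, from the PRICE line
INPUT_TOKEN_PRICE = Decimal("0.000001")   # assumed price per input token (inTok)
OUTPUT_TOKEN_PRICE = Decimal("0.000004")  # assumed price per output token (outTok)


def estimate_call_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the price in USD of one /parse2json call."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    token_cost = input_tokens * INPUT_TOKEN_PRICE + output_tokens * OUTPUT_TOKEN_PRICE
    return BASE_PRICE + token_cost


# Hypothetical call consuming 1000 input tokens and 500 output tokens:
# 0.025 + 0.001 + 0.002 = 0.028 USD.
print(estimate_call_cost(1000, 500))
```

Decimal is used instead of float so that amounts this small add up exactly.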

Using A Schema

Once you have obtained your JSON schema, either by generation and refinement or by crafting it manually, you are ready to use it as input to the Parse2JSON API: provide the "content" key, and the schema JSON object under the "schema" key.

The parser will ensure strict adherence to the structure defined, returning consistently formatted data that can be further ingested in other systems.

Upon failure or refusal to parse, the API returns a non-2xx status code and a message explaining the error condition in detail.
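
As a rough illustration of the call and its failure path, here is a minimal sketch using only the Python standard library. The names build_parse_request, parse_to_json and ParseError are hypothetical, and the format of the error body is not specified in this section, so it is passed through as plain text.

```python
import json
import urllib.error
import urllib.request
from typing import Any

BASE_URL = "https://api.aramid.io"  # request base URL given in the Introduction


def build_parse_request(content: str, schema: dict[str, Any], token: str) -> urllib.request.Request:
    """Build the POST /parse2json request without sending it."""
    body = json.dumps({"content": content, "schema": schema}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/parse2json",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


class ParseError(Exception):
    """Raised when the API answers with an error status (hypothetical helper)."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def parse_to_json(content: str, schema: dict[str, Any], token: str, timeout: float = 60.0) -> Any:
    """Send the request and return the decoded JSON response."""
    request = build_parse_request(content, schema, token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx and 5xx answers; the body is
        # assumed to carry the API's explanation of the failure.
        raise ParseError(err.code, err.read().decode("utf-8", errors="replace")) from err
```

Keeping the request construction separate from the network call makes the payload easy to inspect or log before any credit is spent.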

POST /parse2json PRICE: $0.025 + TC

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Hi Jane,\n\nWe'\''re happy to let you know that your train ticket has been successfully issued. Your journey from Zone 1 to Zone 4 is scheduled to begin on the 1st of March 2025. The ticket cost was £59, and it will remain valid for one full year from the start date, giving you plenty of flexibility to plan your travel.\n\nIf you need any help or have questions about your trip, feel free to get in touch. We'\''re here to make your journey as smooth and enjoyable as possible.\n\nBest regards,\n[Your Company Name]",
    "schema": {
        "type": "object",
        "properties": {
            "recipientName": {
                "type": "string",
                "description": "The name of the recipient, e.g., Jane"
            },
            "ticketStatus": {
                "type": "string",
                "description": "Status of the ticket, e.g., successfully issued"
            },
            "journey": {
                "type": "object",
                "properties": {
                    "from": {
                        "type": "string",
                        "description": "Starting zone of the journey"
                    },
                    "to": {
                        "type": "string",
                        "description": "Destination zone of the journey"
                    }
                },
                "additionalProperties": false,
                "required": [
                    "from",
                    "to"
                ]
            },
            "startDate": {
                "type": "string",
                "description": "Departure date in ISO 8601 format (e.g., 2025-03-01)"
            },
            "ticketCost": {
                "type": "string",
                "description": "Cost of the ticket, including currency symbol"
            },
            "validityPeriod": {
                "type": "object",
                "properties": {
                    "duration": {
                        "type": "string",
                        "description": "Duration for which the ticket remains valid (e.g., '\''one year'\'')"
                    },
                    "startDate": {
                        "type": "string",
                        "description": "Date from which the ticket validity starts in ISO 8601 format"
                    }
                },
                "additionalProperties": false,
                "required": [
                    "duration",
                    "startDate"
                ]
            }
        },
        "additionalProperties": false,
        "required": [
            "recipientName",
            "ticketStatus",
            "journey",
            "startDate",
            "ticketCost",
            "validityPeriod"
        ]
    }
}'

TC = $0.000001 / inTok + $0.000004 / outTok

RESPONSE

{
    "recipientName": "Jane",
    "ticketStatus": "successfully issued",
    "journey": {
        "from": "Zone 1",
        "to": "Zone 4"
    },
    "startDate": "2025-03-01",
    "ticketCost": "£59",
    "validityPeriod": {
        "duration": "one year",
        "startDate": "2025-03-01"
    }
}
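
Before ingesting a response into another system, it can be worth confirming locally that its keys match the schema that was sent. The following is a minimal sketch of such a check: it compares keys only, not value types, and the function name key_mismatches is hypothetical. It is a convenience for the caller and says nothing about how the API enforces the schema.

```python
from typing import Any


def key_mismatches(schema: dict[str, Any], data: Any, path: str = "$") -> list[str]:
    """List required keys missing from data and keys the schema does not declare."""
    problems: list[str] = []
    if isinstance(data, dict):
        properties = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in data:
                problems.append(f"{path}: missing key {key!r}")
        if schema.get("additionalProperties") is False:
            for key in data:
                if key not in properties:
                    problems.append(f"{path}: unexpected key {key!r}")
        # Descend into nested objects that are present in the data.
        for key, subschema in properties.items():
            if key in data:
                problems.extend(key_mismatches(subschema, data[key], f"{path}.{key}"))
    elif isinstance(data, list) and isinstance(schema.get("items"), dict):
        for index, item in enumerate(data):
            problems.extend(key_mismatches(schema["items"], item, f"{path}[{index}]"))
    return problems


# Trimmed-down version of the schema and response shown above.
journey_schema = {
    "type": "object",
    "properties": {
        "journey": {
            "type": "object",
            "properties": {"from": {"type": "string"}, "to": {"type": "string"}},
            "additionalProperties": False,
            "required": ["from", "to"],
        }
    },
    "additionalProperties": False,
    "required": ["journey"],
}
response = {"journey": {"from": "Zone 1", "to": "Zone 4"}}
print(key_mismatches(journey_schema, response))
```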

Inferring Schema

If you already have an example of a valid JSON object whose structure and data types you would like the output to follow, a quick and dirty way to bypass the schema generation - edit - usage process is to use an "example": provide the content to be parsed as usual, and the sample JSON object under the "example" key.

Internally, this takes two steps: first, it generates a schema from the JSON object provided; second, it uses that schema to structure the parsed output.

This approach is most useful for one-off or low-volume content parsing, when going through the rigor of the schema generation process is not needed or desired. The downside is that processing takes longer and costs more: unlike all the other usage modes of the Parse2JSON API, this operation chains two invocations of the AI model, using the output of the first as input to the second.

For a more cost-effective approach with shorter processing times, it is recommended to prepare a schema first and then use it directly as input - either through suggestion, authoring, or by converting a JSON output object to a JSON schema (see below).

POST /parse2json PRICE: $0.025 + TC

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "Hi Jane,\n\nWe'\''re happy to let you know that your train ticket has been successfully issued. Your journey from Zone 1 to Zone 4 is scheduled to begin on the 1st of March 2025. The ticket cost was £59, and it will remain valid for one full year from the start date, giving you plenty of flexibility to plan your travel.\n\nIf you need any help or have questions about your trip, feel free to get in touch. We'\''re here to make your journey as smooth and enjoyable as possible.\n\nBest regards,\n[Your Company Name]",
    "example": {
        "recipientName": "John",
        "ticketStatus": "issued",
        "journey": {
            "fromZone": "London City",
            "toZone": "Nottingham",
            "startDate": "2025-05-18"
        },
        "ticket": {
            "cost": "£258",
            "validityPeriod": "1 year"
        },
        "contact": {
            "message": "Should you need any assistance with your travel plans or have any questions about your ticket, don’t hesitate to reach out. We’re always here to help."
        },
        "company": "[Your Company Name]"
    }
}'

TC = $0.000001 / inTok + $0.000004 / outTok

RESPONSE

{
    "recipientName": "Jane",
    "ticketStatus": "successfully issued",
    "journey": {
        "fromZone": "Zone 1",
        "toZone": "Zone 4",
        "startDate": "2025-03-01"
    },
    "ticket": {
        "cost": "£59",
        "validityPeriod": "1 year from the start date"
    },
    "contact": {
        "message": "If you need any help or have questions about your trip, feel free to get in touch."
    },
    "company": "[Your Company Name]"
}

Object To Schema Conversion

The Parse2JSON API can be used to generate a JSON schema from a JSON object. This is useful when you take a bottom-up approach to developing your parsing pipeline: you have content to be parsed and, instead of authoring a schema, you have already created an example JSON object showing how you wish your output to look. A schema is required as input for structuring the parsed content, and this approach lets you generate it automatically from the desired final output format.

To use this feature, provide your example JSON object in the "content" key without any other parameters - this is the only mode in which the API accepts an object for content. The response is the reverse-engineered JSON schema, to be used as input in subsequent calls.
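
The request bodies for the four usage modes in this section differ only in which key accompanies "content". The sketch below builds them with one helper. The name build_payload is hypothetical, and the rule that the mode keys are mutually exclusive is an assumption; the text only states that an object "content" must be sent without other parameters.

```python
from typing import Any, Optional, Union


def build_payload(
    content: Union[str, dict[str, Any]],
    *,
    suggest_schema: bool = False,
    schema: Optional[dict[str, Any]] = None,
    example: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build the JSON body for one /parse2json call."""
    chosen = sum([suggest_schema, schema is not None, example is not None])
    if chosen > 1:
        # Assumption: the modes are not meant to be combined in one call.
        raise ValueError("use at most one of: suggest_schema, schema, example")
    if isinstance(content, dict) and chosen:
        # Object-to-schema conversion takes the object alone.
        raise ValueError("an object 'content' must be sent without other parameters")

    payload: dict[str, Any] = {"content": content}
    if suggest_schema:
        payload["suggest"] = "schema"
    elif schema is not None:
        payload["schema"] = schema
    elif example is not None:
        payload["example"] = example
    return payload


# Object-to-schema conversion: the example object is the content itself.
print(build_payload({"recipientName": "John", "ticketStatus": "issued"}))
```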

POST /parse2json PRICE: $0.025 + TC

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": {
        "recipientName": "John",
        "ticketStatus": "issued",
        "journey": {
            "fromZone": "London City",
            "toZone": "Nottingham",
            "startDate": "2025-05-18"
        },
        "ticket": {
            "cost": "£258",
            "validityPeriod": "1 year"
        },
        "contact": {
            "message": "Should you need any assistance with your travel plans or have any questions about your ticket, don’t hesitate to reach out. We’re always here to help."
        },
        "company": "[Your Company Name]"
    }
}'

TC = $0.000001 / inTok + $0.000004 / outTok

RESPONSE

{
    "type": "object",
    "properties": {
        "recipientName": {
            "type": "string",
            "description": "Name of the recipient"
        },
        "ticketStatus": {
            "type": "string",
            "description": "Current status of the ticket"
        },
        "journey": {
            "type": "object",
            "properties": {
                "fromZone": {
                    "type": "string",
                    "description": "Starting zone of the journey"
                },
                "toZone": {
                    "type": "string",
                    "description": "Destination zone of the journey"
                },
                "startDate": {
                    "type": "string",
                    "description": "Start date of the journey in YYYY-MM-DD format"
                }
            },
            "required": [
                "fromZone",
                "toZone",
                "startDate"
            ],
            "additionalProperties": false
        },
        "ticket": {
            "type": "object",
            "properties": {
                "cost": {
                    "type": "string",
                    "description": "Cost of the ticket, inclusive of currency symbol"
                },
                "validityPeriod": {
                    "type": "string",
                    "description": "Validity period of the ticket"
                }
            },
            "required": [
                "cost",
                "validityPeriod"
            ],
            "additionalProperties": false
        },
        "contact": {
            "type": "object",
            "properties": {
                "message": {
                    "type": "string",
                    "description": "Contact message for assistance"
                }
            },
            "required": [
                "message"
            ],
            "additionalProperties": false
        },
        "company": {
            "type": "string",
            "description": "Company issuing the ticket"
        }
    },
    "required": [
        "recipientName",
        "ticketStatus",
        "journey",
        "ticket",
        "contact",
        "company"
    ],
    "additionalProperties": false
}
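
To preview the shape of schema to expect from a conversion, or to sanity-check the one returned, a naive local derivation can help. The sketch below is not how the API works: the API uses an AI model and also writes the "description" texts, whereas this deterministic stand-in only maps Python value types to JSON schema types. The function name schema_skeleton is hypothetical.

```python
from typing import Any


def schema_skeleton(value: Any) -> dict[str, Any]:
    """Derive a strict JSON schema skeleton (no descriptions) from a sample value."""
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: schema_skeleton(item) for key, item in value.items()},
            "required": list(value),
            "additionalProperties": False,
        }
    if isinstance(value, list):
        # Simplification: the first element stands in for the whole list.
        return {"type": "array", "items": schema_skeleton(value[0]) if value else {}}
    if isinstance(value, bool):  # checked before int, since bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        return {"type": "null"}
    raise TypeError(f"unsupported value type: {type(value).__name__}")


sample = {"ticket": {"cost": "£258", "validityPeriod": "1 year"}, "company": "[Your Company Name]"}
print(schema_skeleton(sample))
```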

Putting It All Together

As a complete example, let us assume a common use case: digitizing physical documents - in our case, receipts.

It is sufficient to take legible photos of each receipt - they do not need to be scans. Given the AI's built-in OCR capabilities, lighting, text slanting, kerning and other such factors do not affect the result as long as the text remains legible.

Once such a database of images is created, the Parse2JSON integration workflow for obtaining structured digital output from each input image would be as follows:

In development:

Do one call, providing one content item (one receipt photo), with "suggest":"schema" mode enabled. This takes in the receipt, analyzes its content and structure, and returns a valid JSON schema for the information extracted.
Review the obtained schema and, if needed, modify it per your requirements, adjusting for your particular receipt formats or expected return fields.
Do another call with the same input as before (the same receipt), but this time with the schema you have authored under the "schema" key. Observe the output and compare the input image against the output JSON until you are satisfied with the returned structure.

Repeat the process as necessary, testing with different inputs while maintaining the same schema, evolving it as needed. Once you are happy with the structure and data extracted, you are ready to integrate the Parse2JSON API in your production code.

In production:

For each of the items you wish to parse, do one call providing the input content and the schema prepared above. By providing the same input schema, you ensure that all items are parsed into the same structure, which you can then process according to your needs in a unified manner across all inputs.
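
The production step can be sketched as a loop that encodes each receipt photo and pairs it with the one agreed schema. The request below passes the image as a base64 data URL in "content", so the sketch does the same. The helper names to_data_url and receipt_payloads are hypothetical, and sending the payloads is left out.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Any, Iterable, Iterator


def to_data_url(image_bytes: bytes, mime_type: str) -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def receipt_payloads(paths: Iterable[Path], schema: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield one request body per receipt image, all sharing the same schema."""
    for path in paths:
        # The MIME type is guessed from the file extension only.
        mime_type, _ = mimetypes.guess_type(path.name)
        if mime_type is None or not mime_type.startswith("image/"):
            raise ValueError(f"not a recognised image file: {path}")
        yield {"content": to_data_url(path.read_bytes(), mime_type), "schema": schema}


# The 8-byte PNG signature encodes to the familiar "iVBORw0K" prefix.
print(to_data_url(b"\x89PNG\r\n\x1a\n", "image/png"))
```

Because the function yields payloads lazily, only one image is held in memory at a time when iterating over a large folder.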

POST /parse2json PRICE: $0.025 + TC

curl -L -X POST 'https://api.aramid.io/parse2json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ••••••' \
-d '{
    "content": "data:image/png;base64,iVBORw0K ••••••",
    "schema": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            "title": {
                "type": "string",
                "description": "The title of the document or invoice"
            },
            "tagline": {
                "type": "string",
                "description": "The company tagline"
            },
            "slogan": {
                "type": "string",
                "description": "The company slogan"
            },
            "company": {
                "type": "string",
                "description": "Name of the company"
            },
            "address": {
                "type": "string",
                "description": "Company address"
            },
            "CIF": {
                "type": "string",
                "description": "Company tax identification number"
            },
            "items": {
                "type": "array",
                "description": "List of items included in the transaction",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {
                            "type": "string",
                            "description": "Name of the item"
                        },
                        "quantity": {
                            "type": "integer",
                            "description": "Quantity of the item"
                        },
                        "unit_price": {
                            "type": "number",
                            "description": "Price per unit of the item"
                        },
                        "total_price": {
                            "type": "number",
                            "description": "Total price for the item"
                        }
                    },
                    "required": [
                        "name",
                        "quantity",
                        "unit_price",
                        "total_price"
                    ],
                    "additionalProperties": false
                }
            },
            "command": {
                "type": "string",
                "description": "Command code or reference"
            },
            "operator": {
                "type": "string",
                "description": "Operator or sales person identifier"
            },
            "total": {
                "type": "number",
                "description": "Total amount for the transaction"
            },
            "payment": {
                "type": "object",
                "description": "Payment details",
                "properties": {
                    "card": {
                        "type": "number",
                        "description": "Amount paid via card"
                    },
                    "rest": {
                        "type": "number",
                        "description": "Remaining amount to be paid"
                    }
                },
                "required": [
                    "card",
                    "rest"
                ],
                "additionalProperties": false
            },
            "taxes": {
                "type": "object",
                "description": "Tax details with VAT rates",
                "properties": {
                    "VAT A - 19%": {
                        "type": "number",
                        "description": "VAT at 19%"
                    },
                    "VAT B - 9%": {
                        "type": "number",
                        "description": "VAT at 9%"
                    },
                    "Total VAT": {
                        "type": "number",
                        "description": "Total VAT amount"
                    }
                },
                "required": [
                    "VAT A - 19%",
                    "VAT B - 9%",
                    "Total VAT"
                ],
                "additionalProperties": false
            },
            "transaction": {
                "type": "object",
                "description": "Transaction identifiers and details",
                "properties": {
                    "Z": {
                        "type": "string",
                        "description": "Z code of the transaction"
                    },
                    "BF": {
                        "type": "string",
                        "description": "BF code of the transaction"
                    },
                    "S/SN": {
                        "type": "string",
                        "description": "Serial number of the transaction"
                    },
                    "OR": {
                        "type": "string",
                        "description": "Order time"
                    },
                    "T": {
                        "type": "string",
                        "description": "Transaction timestamp"
                    },
                    "cashier": {
                        "type": "string",
                        "description": "Cashier identifier"
                    }
                },
                "required": [
                    "Z",
                    "BF",
                    "S/SN",
                    "OR",
                    "T",
                    "cashier"
                ],
                "additionalProperties": false
            },
            "additional_info": {
                "type": "object",
                "description": "Additional transactional info",
                "properties": {
                    "date": {
                        "type": "string",
                        "description": "Transaction date"
                    },
                    "time": {
                        "type": "string",
                        "description": "Transaction time"
                    }
                },
                "required": [
                    "date",
                    "time"
                ],
                "additionalProperties": false
            }
        },
        "required": [
            "title",
            "tagline",
            "slogan",
            "company",
            "address",
            "CIF",
            "items",
            "command",
            "operator",
            "total",
            "payment",
            "taxes",
            "transaction",
            "additional_info"
        ],
        "additionalProperties": false
    }
}'

TC = $0.000001 / inTok + $0.000004 / outTok

RESPONSE

{
  "title": "Digital Plumbers",
  "tagline": "YOUR DIGITAL INFRASTRUCTURE",
  "slogan": "WE CODE WELL, WE DO GOOD",
  "company": "Digital Plumbers SRL",
  "address": "404 Firewall Street, District 2, Bucharest",
  "CIF": "RO404404404",
  "items": [
    {
      "name": "Bits Package",
      "quantity": 4,
      "unit_price": 5.5,
      "total_price": 22
    },
    {
      "name": "Compressed Bytes",
      "quantity": 2,
      "unit_price": 5.99,
      "total_price": 11.98
    },
    {
      "name": "Mini Data Pipeline",
      "quantity": 1,
      "unit_price": 2.5,
      "total_price": 2.5
    },
    {
      "name": "Code Chip Cookie",
      "quantity": 1,
      "unit_price": 4.99,
      "total_price": 4.99
    },
    {
      "name": "Antistatic Bag",
      "quantity": 1,
      "unit_price": 1,
      "total_price": 1
    },
    {
      "name": "CPU Coolant (500ml)",
      "quantity": 1,
      "unit_price": 4.5,
      "total_price": 4.5
    }
  ],
  "command": "#DP352294",
  "operator": "#SALES",
  "total": 46.97,
  "payment": {
    "card": 46.97,
    "rest": 0
  },
  "taxes": {
    "VAT A - 19%": 0.16,
    "VAT B - 9%": 3.8,
    "Total VAT": 3.96
  },
  "transaction": {
    "Z": "0852",
    "BF": "0517",
    "S/SN": "DP4700026189",
    "OR": "13:28:19",
    "T": "00641520",
    "cashier": "BOT CASHIER 1"
  },
  "additional_info": {
    "date": "2025-04-15",
    "time": "13:28:19"
  }
}
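
Since every receipt comes back in the same structure, downstream checks can be written once. As one possible example, the sketch below reconciles the line items against the receipt total. The function names are hypothetical, and only the "items" and "total" keys from the schema above are used.

```python
from decimal import Decimal
from typing import Any, Union


def to_decimal(value: Union[int, float]) -> Decimal:
    """Convert a JSON number via str(), so 11.98 becomes exactly Decimal('11.98')."""
    return Decimal(str(value))


def receipt_discrepancies(receipt: dict[str, Any]) -> list[str]:
    """Report line items and totals that do not add up."""
    problems: list[str] = []
    running_total = Decimal("0")
    for item in receipt["items"]:
        expected = item["quantity"] * to_decimal(item["unit_price"])
        actual = to_decimal(item["total_price"])
        if expected != actual:
            problems.append(
                f"{item['name']}: {item['quantity']} x {item['unit_price']} != {item['total_price']}"
            )
        running_total += actual
    if running_total != to_decimal(receipt["total"]):
        problems.append(f"items sum to {running_total}, receipt total is {receipt['total']}")
    return problems


# First two items of the response above, with a total adjusted to match them.
partial = {
    "items": [
        {"name": "Bits Package", "quantity": 4, "unit_price": 5.5, "total_price": 22},
        {"name": "Compressed Bytes", "quantity": 2, "unit_price": 5.99, "total_price": 11.98},
    ],
    "total": 33.98,
}
print(receipt_discrepancies(partial))
```

An empty list means the extracted figures are internally consistent; anything else flags a receipt for manual review.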