Defining Data Sources and Process Levels
Let's understand the processing levels for different data source types.
Text Files
Step 1: Select the Process Data Source Type as "Text Files".
Step 2: Click <icon> to define the data source. In the Text Extractor wizard, expand the Text Extractor to locate the entity.
Example: A bank has a large text file containing all credit card statements for the month.
===============================
Customer account details
===============================
--------------------------------
MAIN ACCOUNT SUMMARY
--------------------------------
Customer Name : Mr. R. Kumar
Customer ID : 78912345
Card Number : XXXX XXXX XXXX 4567
Statement Month : November 2025
Credit Limit : ₹ 2,00,000
Available Limit : ₹ 1,57,800
Total Due : ₹ 42,200
Minimum Amount Due : ₹ 4,220
Due Date : 10-Dec-2025
--------------------------------
TRANSACTION HISTORY
--------------------------------
# Transaction 1
Date : 04-Nov-2025
Merchant : Amazon India
Description : Online Purchase - Electronics
Amount : ₹ 15,500
Status : Posted
# Transaction 2
Date : 10-Nov-2025
Merchant : Apollo Pharmacy
Description : Medical Purchase
Amount : ₹ 1,245
Status : Posted
This file has information in two levels:
Main Account (Customer summary)
Sub Accounts / Transaction history
To extract this data correctly, the system needs to know where each part begins and ends. This is what Entities help with.
Entity One is the default entity. It represents Parent Account Information. Click it and specify the identifier and other properties to tell the system how to identify and extract the data from a text file.
Defining the Entity Properties:
Select the Type to specify the beginning of an entity (Identifier / Line Count). Use:
Identifier, when you can locate the section by a specific word or phrase.
Line Count, when the section always begins after a fixed number of lines, with no unique text to search for.
Here's a quick reference for which Type to select and when to enable Match Case/Match Word:
Type: Identifier
When to use: When a section contains a unique text pattern that can be searched.
Example: Parent account details always begin with the label “CUSTOMER ACCOUNT DETAILS”.
Type: Identifier + Match Case/Match Word
When to use: When the system should only match exact text.
Example: Transaction rows contain “DEBIT” in uppercase. Using Match Case ensures the system doesn’t pick “Debit” or “debit”.
Type: Line Count
When to use: When the file uses a fixed layout, and the next section always begins after a set number of lines.
Example: A sub-account record always begins 5 lines below the parent record. Set Line Count = 5 to capture the sub-account details.
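To make the three options concrete, here is a minimal Python sketch of how a start-of-entity lookup could behave. It is an illustration only, not the product's implementation: the function names, the sample lines ("Type", "Remarks"), and the assumption that Match Word means "do not match inside a longer word" are all hypothetical.

```python
import re


def find_by_identifier(lines, identifier, match_case=False, match_word=False, start=0):
    """Return the index of the first line (from `start`) containing `identifier`, or -1."""
    pattern = re.escape(identifier)
    if match_word:
        # Assumed meaning of Match Word: the text must not be part of a longer word.
        pattern = r"\b" + pattern + r"\b"
    flags = 0 if match_case else re.IGNORECASE
    regex = re.compile(pattern, flags)
    for index in range(start, len(lines)):
        if regex.search(lines[index]):
            return index
    return -1


def find_by_line_count(parent_index, line_count):
    """Return the index of the line `line_count` lines below the parent record."""
    return parent_index + line_count


# Hypothetical input lines, used only to show the difference between the options.
lines = [
    "CUSTOMER ACCOUNT DETAILS",
    "Customer Name : Mr. R. Kumar",
    "Type : Debit",
    "Remarks : DEBITED at source",
    "Type : DEBIT",
]

parent = find_by_identifier(lines, "customer account details")  # plain Identifier
loose = find_by_identifier(lines, "DEBIT")  # also matches "Debit"
exact = find_by_identifier(lines, "DEBIT", match_case=True, match_word=True)
sub_account = find_by_line_count(parent, 5)  # Line Count = 5
```

In this sketch, the plain identifier search stops at "Type : Debit", while enabling both options skips "Debit" and "DEBITED" and lands on the line that contains exactly "DEBIT".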
Adding additional entities: Let's understand where and why additional entities are required. For example, if the bank wants to extract additional information such as the Reward points summary or EMI details, add more entities by clicking '+' in the parent entity.
--------------------------------
REWARD POINTS SUMMARY
--------------------------------
Total Reward Points Earned This Month : 1,250
Total Reward Points Available : 8,940
Equivalent Cashback Value : ₹ 894
Recent Earned Points:
• Amazon Purchase (₹ 15,500) → 775 Points
• Fuel (₹ 1,800) → 180 Points
• Others → 295 Points
Redemption Options Available:
- Cashback
- Flight Miles
- Gift Vouchers
- Utility Bill Payments
--------------------------------
EMI DETAILS
--------------------------------
EMI #1
Description : Mobile Phone Purchase (Flipkart)
Purchase Amount : ₹ 36,000
EMI Amount : ₹ 3,000 / month
Tenure : 12 Months
Remaining Months : 6
Interest Rate : 14%
Each added entity helps the system separate sections like this:
Entity One --> Account summary
Entity Two --> Transactions
Entity Three --> Rewards details
Entity Four --> EMI details
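The entity-to-section mapping can be pictured with a short Python sketch. It assumes each entity uses its section heading as the identifier and that a section runs until the next heading; the function name split_into_entities and the shortened statement are hypothetical and only illustrate the idea.

```python
def split_into_entities(text, identifiers):
    """Map each entity name to the content lines that follow its identifier."""
    lines = text.splitlines()
    starts = []
    for name, identifier in identifiers.items():
        for index, line in enumerate(lines):
            if line.strip() == identifier:
                starts.append((index, name))
                break
    starts.sort()

    sections = {}
    for position, (index, name) in enumerate(starts):
        # A section ends where the next identified section begins.
        end = starts[position + 1][0] if position + 1 < len(starts) else len(lines)
        sections[name] = [
            line
            for line in lines[index + 1:end]
            # Drop blank lines and separator rules made only of '-' or '='.
            if line.strip() and not set(line.strip()) <= {"-", "="}
        ]
    return sections


# Shortened version of the sample statement.
statement = """\
--------------------------------
MAIN ACCOUNT SUMMARY
--------------------------------
Customer Name : Mr. R. Kumar
Customer ID : 78912345
--------------------------------
TRANSACTION HISTORY
--------------------------------
# Transaction 1
Merchant : Amazon India
--------------------------------
REWARD POINTS SUMMARY
--------------------------------
Total Reward Points Available : 8,940
--------------------------------
EMI DETAILS
--------------------------------
EMI #1
Tenure : 12 Months
"""

identifiers = {
    "Entity One": "MAIN ACCOUNT SUMMARY",
    "Entity Two": "TRANSACTION HISTORY",
    "Entity Three": "REWARD POINTS SUMMARY",
    "Entity Four": "EMI DETAILS",
}

sections = split_into_entities(statement, identifiers)
```

Each key in sections then holds only the lines of its own section, which is the separation the entities are meant to provide.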
<<IMAGE - Adding a new entity>>
As you add entities, they are created in a hierarchical model, acting as parent and sub-accounts.
Once created, define the properties for each entity and click OK to continue.
You cannot rename the entity once created.
You can delete the unused entities.
Step 3: Define process levels. To define a level, select the default level (Level1) and select its value in the Properties.
<<IMAGE - Level 1 Processing>>
If your data source has another level inside it, click '+' to add another level for processing and specify its properties.
<<IMAGE - Level 2 Processing>>
Follow the same process to define all the levels.
XML
Select the Process Data Source Type as "XML" and follow the same steps as described for the Text Files data source.
Here's the sample XML structure:
<CustomerAccountDetails>
<CustomerSummary>
<CustomerName>Mr. R. Kumar</CustomerName>
<CustomerID>78912345</CustomerID>
<CardNumber>XXXX XXXX XXXX 4567</CardNumber>
<StatementMonth>November 2025</StatementMonth>
<CreditLimit>200000</CreditLimit>
<AvailableLimit>157800</AvailableLimit>
<TotalDue>42200</TotalDue>
<MinimumAmountDue>4220</MinimumAmountDue>
<DueDate>10-Dec-2025</DueDate>
</CustomerSummary>
<Transactions>
<Transaction>
<Date>04-Nov-2025</Date>
<Merchant>Amazon India</Merchant>
<Description>Online Purchase - Electronics</Description>
<Amount>15500</Amount>
<Status>Posted</Status>
</Transaction>
<Transaction>
<Date>10-Nov-2025</Date>
<Merchant>Apollo Pharmacy</Merchant>
<Description>Medical Purchase</Description>
<Amount>1245</Amount>
<Status>Posted</Status>
</Transaction>
<Transaction>
<Date>22-Nov-2025</Date>
<Merchant>Big Bazaar</Merchant>
<Description>Grocery &amp; Household</Description>
<Amount>5600</Amount>
<Status>Posted</Status>
</Transaction>
</Transactions>
<RewardPoints>
<EarnedThisMonth>1250</EarnedThisMonth>
<Available>8940</Available>
<CashbackValue>894</CashbackValue>
</RewardPoints>
<EMIDetails>
<EMI>
<Description>Mobile Phone Purchase (Flipkart)</Description>
<PurchaseAmount>36000</PurchaseAmount>
<EMIAmount>3000</EMIAmount>
<RemainingMonths>6</RemainingMonths>
<InterestRate>14%</InterestRate>
</EMI>
</EMIDetails>
</CustomerAccountDetails>
JSON
Step 1: Select the Process Data Source Type as "JSON" when the input is structured JSON.
Step 2: Click <icon> to define the data source. In the JSON Extractor wizard, expand the JSON Extractor to locate the entity.
Example: A bank's JSON feed contains Customer Information, a Monthly Account Summary, and Transactions.
{
"customer": {
"name": "John",
"accountNumber": "1234567890",
"statementMonth": "November 2025"
},
"summary": {
"openingBalance": 20000,
"closingBalance": 25000
},
"transactions": [
{
"date": "2025-01-03",
"description": "POS PURCHASE",
"amount": 1500
},
{
"date": "2025-01-11",
"description": "ATM WITHDRAWAL",
"amount": 2000
}
]
}
Defining the Entity Properties:
Select the Type as "Identifier" to specify the beginning of an entity. For JSON files, the opening brace '{' serves as the identifier; alternatively, you can specify a line count.
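As a rough illustration of why '{' works as an identifier, the Python sketch below walks the sample feed and lists every JSON object, since each object begins with '{'. The walk_objects helper and the path notation are hypothetical, and how these objects map onto entities and levels in the wizard is an assumption.

```python
import json

raw = """
{
  "customer": {"name": "John", "accountNumber": "1234567890", "statementMonth": "November 2025"},
  "summary": {"openingBalance": 20000, "closingBalance": 25000},
  "transactions": [
    {"date": "2025-01-03", "description": "POS PURCHASE", "amount": 1500},
    {"date": "2025-01-11", "description": "ATM WITHDRAWAL", "amount": 2000}
  ]
}
"""


def walk_objects(node, path="$"):
    """Yield (path, obj) for every JSON object, that is, every '{' in the source text."""
    if isinstance(node, dict):
        yield path, node
        for key, value in node.items():
            yield from walk_objects(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from walk_objects(item, f"{path}[{index}]")


feed = json.loads(raw)
paths = [path for path, _ in walk_objects(feed)]

# Objects nested inside the "transactions" array form the repeating inner level.
transaction_total = sum(
    obj["amount"] for path, obj in walk_objects(feed) if path.startswith("$.transactions[")
)
```

Here the root object, "customer", and "summary" sit at the outer level, while each element of "transactions" is a repeating record one level down.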
Step 3: Define process levels.
To define a level, select the default level (Level1) and select its value in the Properties.
<<IMAGE - Level 1 Processing>>
Enable "Content Authoring" only when you want to let your users change the content at the text/image level without working with the template directly.
If enabled, specify the design workflow and select the server for batch processing.
To manage batch configurations, click <icon>. Click here to know more about batch configurations.