NiFi users quickly learn that FlowFiles have built-in fields like uuid and filename, because these are plainly visible in the UI and referenced in many Expression Language examples.
Only after I read Mark Payne's answer to a StackOverflow question about the lineageStartDate field did I appreciate that there might be ways to reference internal fields from Expression Language that are not obvious from the UI. So I hunted through some code to find out which fields they were and where they came from.
While you can do a lot in NiFi without knowing these fields, exploring where they come from gives a better understanding of how the fields visible in the UI map to NiFi internals.
Values in Expression Language
First, I tried to find out how internal values of FlowFiles become available in Expression Language (EL), or remain unavailable. The place to look is the ValueLookup class, which is where FlowFile data is exposed to EL. In ValueLookup (see ValueLookup:66), you can see several sources of EL values:
- Attributes - As part of the same user-modifiable attribute set, some attributes are added by default. uuid and filename are probably the best examples.
- Properties - FlowFiles also have intrinsic properties, a select number of which are exposed through Expression Language. The names of these variables are set when they are added in ValueLookup, and the names may differ from the class variable names in code and from the descriptive names given in the UI.
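To make the two sources concrete, here is a minimal Java sketch of the idea. It is not NiFi's actual ValueLookup code: the class ElValueSketch, its buildLookup method, and the sample values are hypothetical, and only the variable names fileSize and lineageStartDate are taken from this post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not NiFi source code: one flat name -> String map
// that combines user-visible attributes with a few intrinsic properties.
final class ElValueSketch {

    // Build the map that an expression such as ${fileSize} would read from.
    static Map<String, String> buildLookup(Map<String, String> attributes,
                                           long fileSize,
                                           long lineageStartDate) {
        Map<String, String> values = new HashMap<>(attributes);
        // Properties get their EL names at this point and are stored as Strings.
        // Assumption of this sketch: a property overwrites an attribute of the same name.
        values.put("fileSize", String.valueOf(fileSize));
        values.put("lineageStartDate", String.valueOf(lineageStartDate));
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new HashMap<>();
        attributes.put("filename", "example.txt");
        Map<String, String> values = buildLookup(attributes, 2048L, 1000L);
        System.out.println(values.get("filename"));
        System.out.println(values.get("fileSize"));
    }
}
```

The point of the sketch is only that the EL name of a property is decided where the map is filled, which is why it need not match the Java field name or the UI label.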
I followed both default attributes and properties a bit more.
Hidden Fields?
In fact, the Expression Language module loads both "properties" and "attributes" of FlowFiles.
Core Attributes
Attributes are mostly left to the user to define, use, and abuse however they choose. But there are a small number of "Core Attributes" that are defined by the framework's CoreAttributes enumeration. An even smaller subset of these attributes is initialized with default values in the StandardProcessSession.create() method.
Core Attribute | Description |
---|---|
uuid | Read-only. Initialized to UUID.randomUUID() |
filename | Set by default to System.nanoTime() |
path | Relative directory portion of path, excluding the leaf file name. Default is ./ |
absolute.path | Absolute directory portion of path, excluding the leaf file name. No default. |
priority | Used by PriorityAttributePrioritizer, no default |
mime.type | Widely used by processors. No default. |
discard.reason | No default |
alternate.identifier | No default |
uuid is probably the most special, in that it is also protected from being overwritten.
All attributes are of type String, even if they store numbers.
And all are displayed in the UI when present, just like any user-specified attributes.
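The defaults in the table can be sketched in a few lines of Java. This is an illustration under assumptions, not the StandardProcessSession code: the class DefaultAttributesSketch and both method names are hypothetical, and silently ignoring a write to uuid is just one plausible way to model the overwrite protection.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical illustration of the default core attributes described above.
final class DefaultAttributesSketch {

    // Create the attribute map a brand-new FlowFile might start with.
    static Map<String, String> newDefaults() {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("uuid", UUID.randomUUID().toString());
        attrs.put("filename", String.valueOf(System.nanoTime()));
        attrs.put("path", "./");
        return attrs;
    }

    // Set an attribute, leaving uuid untouched (assumed behavior for this sketch).
    static void putAttribute(Map<String, String> attrs, String key, String value) {
        if ("uuid".equals(key)) {
            return; // uuid is treated as read-only here
        }
        attrs.put(key, value);
    }

    public static void main(String[] args) {
        Map<String, String> attrs = newDefaults();
        putAttribute(attrs, "mime.type", "text/plain");
        putAttribute(attrs, "uuid", "not-applied");
        System.out.println(attrs);
    }
}
```

Note that every value, including the nanoTime-based filename, ends up as a String, matching the remark above about attribute types.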
Properties
But wait! These attributes are in plain sight; didn't I promise you "hidden" stuff? Yes, and that's where we come to the FlowFile "properties" referenced in the Expression Language's ValueLookup class since NiFi 1.0.0. Properties are intrinsic data values of FlowFiles, and some of them are made available in Expression Language:
Name in Expression Language | Field Name (UI) | Description |
---|---|---|
flowFileId | (not shown) | internal identifier |
fileSize | File Size | Size of content, in bytes |
entryDate | File Size | Timestamp (milliseconds) when the FlowFile entered this NiFi |
lineageStartDate | Used to calculate Lineage Duration | Timestamp milliseconds |
lastQueueDate | Used to calculate Queue Duration | Timestamp when last placed in queue |
queueDateIndex | Queue Position | Offset in queue |
In contrast to attributes, properties are less visible in that they are not echoed verbatim in the UI, or at least not under their Expression Language names.
Examples
There is no substitute for trying it out, so here are some examples of Expression Language that use these fields.
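Using lineageStartDate to capture the length of time a FlowFile has been in NiFi. Note that format() treats the result as a date in the local time zone, so the output reads as a duration only with a zero UTC offset and for spans under 24 hours: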
${now():toNumber():minus(${lineageStartDate}):format("HH:mm:ss")}
Size of a FlowFile in KB, by integer division:
${fileSize:toNumber():divide(1024)}
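For readers who want to check the arithmetic outside NiFi, the same two calculations can be sketched in plain Java. This is an illustrative equivalent, not how Expression Language evaluates them: the class ElArithmeticSketch and its method names are hypothetical, and unlike a date-based format the elapsed time here is computed as a true duration, so hours do not wrap at 24 and no time zone is involved.

```java
import java.time.Duration;

// Hypothetical plain-Java equivalents of the two expressions above.
final class ElArithmeticSketch {

    // Elapsed time between two epoch-millisecond timestamps, as HH:mm:ss.
    static String elapsed(long nowMillis, long lineageStartMillis) {
        Duration d = Duration.ofMillis(nowMillis - lineageStartMillis);
        long hours = d.toHours();
        long minutes = d.toMinutes() % 60;
        long seconds = d.getSeconds() % 60;
        return String.format("%02d:%02d:%02d", hours, minutes, seconds);
    }

    // Size in KB by integer division, so any remainder is discarded.
    static long sizeInKb(long fileSizeBytes) {
        return fileSizeBytes / 1024;
    }

    public static void main(String[] args) {
        System.out.println(elapsed(3_725_000L, 0L));
        System.out.println(sizeInKb(2047L));
    }
}
```

With the sample values in main, a difference of 3,725,000 ms corresponds to 1 hour, 2 minutes, and 5 seconds, and 2047 bytes truncates to 1 KB.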
I've saved a NiFi template as a Gist that uses Expression Language to evaluate all of the fields and the examples above.