POS TLog Parser
I am releasing this parser, which I coded in perl, under the GPL.
Hopefully it will be useful to someone who does not work at a
backwards company. My company went out of business sucking up to Bill Gates and Micro$oft, and has forgone all claim to or interest in this code.
This whole project was (for me, anyway) a
demonstration of the superiority of open-platform/open-source
software. At my company, we had hundreds of IBM 4690 POS
(Point of Sale) systems across the country running IBM's Supermarket
Application (no, it's not a supermarket - just retail). The
numerous registers send transactions in real time to this application,
which appends them to a Transaction Log (henceforth called TLog) in a
proprietary IBM format.
The trick for me was discovering how to unpack "packed-decimal" fields
in perl. IBM produces a spec for the default record
layouts, but leaves plenty of options for customization.
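The script does the nibble-splitting with perl's unpack; as a cross-language illustration, here is the same packed-decimal (COMP-3 style) logic sketched in Python. The function name and the sign-nibble conventions are common mainframe defaults, not necessarily IBM's exact 4690 layout:

```python
def unpack_packed_decimal(raw: bytes) -> int:
    """Decode a packed-decimal (COMP-3 style) field.

    Each byte holds two 4-bit decimal digits; the final nibble
    is the sign (0xD = negative, 0xC or 0xF = positive/unsigned).
    """
    digits = []
    for byte in raw:
        digits.append(byte >> 4)      # high nibble
        digits.append(byte & 0x0F)    # low nibble
    sign_nibble = digits.pop()        # last nibble is the sign
    value = int("".join(str(d) for d in digits))
    return -value if sign_nibble == 0x0D else value
```

For example, the three bytes 0x01 0x23 0x4D decode to the digits 01234 with a negative sign nibble, i.e. -1234.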
At project inception, the company purchased numerous high-dollar
commercial products and contract services to put this together. The
TLog parsing job went to [a commercial company]. I had to match enough
of that layout to be compatible with the database infrastructure we had already
built. Their solution worked like this:
1) Each store's TLogs go in a separate directory, labeled "Store_###", under a defined root directory.
2) TLog files accumulate in each directory. They have numerical names which grow sequentially.
3) TLogs are parsed into separate files, each representing a different record type.
4) A description of the output files and their layout is in an *.ini file.
5) A keyed binary file kept track of the last TLog sequence parsed.
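The scanning scheme in items 1, 2, and 5 can be sketched as follows; the directory-name pattern and the checkpoint dict here are assumptions standing in for the real directory tree and the keyed binary file:

```python
import os
import re

def select_new_tlogs(listing, checkpoint):
    """Pick TLog files that have not been parsed yet.

    `listing` maps a store directory name (e.g. "Store_042") to the
    file names inside it; `checkpoint` maps a store name to the last
    numeric sequence already parsed.  Both are stand-ins for the real
    directory tree and the keyed checkpoint data described above.
    """
    new = []
    for store, names in sorted(listing.items()):
        if not re.fullmatch(r"Store_\d+", store):
            continue                          # only store directories
        last_seq = checkpoint.get(store, -1)
        for name in sorted((n for n in names if n.isdigit()), key=int):
            if int(name) > last_seq:          # names grow sequentially
                new.append(os.path.join(store, name))
    return new
```

Sorting the numeric names as integers (not strings) matters once the sequence passes 9, since "10" sorts before "2" lexically.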
The Nightmare of Commercial Software
In practice, this approach had numerous drawbacks. The program
was not stable and frequently hung, with no indication of what was
wrong. It was not robust and would crash or output
garbage on any minor discrepancy in the data. Fixing
broken parsing runs was difficult and often caused more problems;
editing the binary START file often failed. The compiled
program exhibited the typical symptoms of poor programming
practices: memory leaks, hangs, slowdowns. Evolving
business processes meant new records and fields were constantly being
added to the TLog. Each minor fix or enhancement turned into an
expensive, delayed support request, often resulting in unexpected new
problems requiring a complete end-to-end QA cycle.
A Better Way
Developing a perl solution to duplicate [the commercial product] took me about 2 weeks
on Linux; however, the company required it to run on Windows.
Porting to Windows (and making it production-ready) took another 6
weeks. First I had to duplicate the output of [the commercial product] for
every conceivable record (which required adding silly hacks to my clean code
to reproduce their bugs), and then modify it for new TLog
requirements. [The commercial company] could not deliver a reasonable estimate
for the same enhancement to their code, so my script replaced it, about
3 months after I had started development. Of course,
done right, my script offers many improvements over [the commercial product]:
1) The process is split into two distinct parts: an input filter, which
converts TLogs into simple text records, and an output filter, which generates
plain text in CSV-formatted files. The input filter duplicates
the functionality of TV++ (a Windows program), although better, and can be used for
many other purposes.
2) The perl process has been completely stable - it has never crashed
or hung. Believe it or not, the perl process is about twice as
fast as the compiled [commercial product]. The DBAs are happy
because they can call it as needed from MS SQL Server DTS packages.
3) Both input and output formats are described by .ini files. Unlike [the commercial product]'s, these are functional
descriptions of the formats: the data definitions drive the logic
of the process. About 95% of the changes made to the TLogs have been
handled by non-programmers making simple edits to the ini files with
any text editor.
4) Consequently, changes are free and take only a few minutes (instead of $50,000 and 3 months!).
5) A simple text file called START specifies the last TLog processed
and its byte count. (Hint: if the same TLog grows, it is
re-parsed. A parallel project parses the current day's TLogs throughout
the day.) It is easy to change this file and parse any number of
previous days' TLogs, e.g., if a format changed and the ini needed to be
updated. Note: corrupted or unrecognized formats are simply ignored
and parsing continues, unlike [the commercial product], which would just crash.
6) Built into all my scripts is a complete debugging capability.
This proved invaluable for resolving parsing problems and isolating bizarre
data from the POS controllers. There was no such feature in
[the commercial product].
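To illustrate item 3 - data definitions driving the logic - here is a minimal Python sketch of an ini-described fixed-field record layout. The section name, the "offset,length,type" field syntax, and the sample record are invented for illustration; the actual script's ini grammar is richer:

```python
import configparser

# A hypothetical layout description: each section is a record type,
# each key a field given as "offset,length,type".
LAYOUT_INI = """
[ITEM_SALE]
upc    = 0,12,text
qty    = 12,4,num
price  = 16,6,num
"""

def load_layouts(text):
    """Read the ini text into {record_type: [(name, offset, length, type)]}."""
    cp = configparser.ConfigParser()
    cp.read_string(text)
    layouts = {}
    for record_type in cp.sections():
        fields = []
        for name, spec in cp.items(record_type):
            offset, length, ftype = spec.split(",")
            fields.append((name, int(offset), int(length), ftype))
        layouts[record_type] = fields
    return layouts

def parse_record(record_type, data, layouts):
    """Slice a raw record according to the ini-defined layout."""
    row = {}
    for name, offset, length, ftype in layouts[record_type]:
        raw = data[offset:offset + length].strip()
        row[name] = int(raw) if ftype == "num" else raw
    return row
```

With this shape, a new field in the TLog becomes a one-line ini edit rather than a code change - which is the whole point of item 4.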
Open Source vs. Proprietary Commercial Software
No legal encumbrances. A vastly superior development and
maintenance model. Faster, more stable and robust, less error-prone.
Cross-platform. Reusable components.
Complete control at no marginal cost.
Clearly you can see why management was so upset with my new program. I had been an
open source advocate at the company for 8 years. This program
was an opportunity to prove the case, and it succeeded
spectacularly. As soon as they realized what I had done,
they tried to replace it. They tasked a team member with duplicating
[the commercial product] using another commercial tool (some expensive data-mapping
software called Mercator). Nine months later, with the help of
numerous others and vendor support, he had a working
parser. Tests were not convincing, as the [commercial product] output
could not be duplicated. Now, 3 years later, M$ has
footed the bill to duplicate the parsing process yet again, in SSIS for
Yukon: Project REAL.
Makes you wonder why they were so eager to jump out of the Ferrari they owned and back into a rental Pinto.
contact me - kdavenpo at tx dot rr dot com