OpenCReport separates data access into two entities: a data source driver and a query.
OpenCReports supports diverse data sources:
SQL is the acronym for Standard Query Language. Many database software comply with the standard to a certain extent. The standard is occasionally revised, and a certain database software version complies to a specific version of the standard to a certain extent.
In general, database software are designed to store massive amounts of data and retrieve it as fast as possible. Database software and its data can be accessed through a network connection (even if it's installed in the same machine) or a faster local connection if both the database server and client are installed on the same computer.
The SQL based data sources OpenCReports natively supports are:
Any SQL database server with a compliant ODBC driver
MariaDB is a fork of MySQL developing in a different direction but still maintaining strong compatibility with each other. The database client library is compatible with both, therefore OpenCReports supports both with the same driver.
PostgreSQL (in their own words) is The World's Most Advanced Open Source Relational Database. OpenCReports supports using PostgreSQL.
The PostgreSQL datasource driver in OpenCReports is
especially economic with memory by using a
WITH HOLD cursor
and the PostgreSQL specific
FETCH count
SQL statement to retrieve a specified number of rows in
one round, as opposed to retrieving every row in one
round that most database software supports.
This makes the report run slower for longer query results compared to other databases, but allows generating the report from extremely large datasets when other databases may fail due to memory exhaustion.
OpenCReports supports using a generic ODBC connection to any database servers using a compliant client driver. OpenCReports uses a standard ODBC manager library, so it is able to use any ODBC DSN (Data Source Name) configured for the system or the user.
The database client libraries for MariaDB, PostgreSQL and ODBC return all the query rows from the database server at once by default. As such, it is possible that a long query result doesn't fit into the computer memory.
The report needs to traverse the query result twice to pre-compute "delayed" values (see precalculated and Precalculated variables), so it needs to be able to rewind the data set once it was read to the end.
The alternative API in MariaDB to load the rows one by one doesn't allow rewinding, so it's not usable for the report's purposes.
It is only PostgreSQL that allows using an
SQL cursor as a standalone entity,
i.e. outside SQL procedures as defined by the SQL
standard. This PostgreSQL extension to the standard
allows saving memory in such a way that it allows
processing very long query results. Behind the scenes,
a WITH HOLD
cursor is used and
1024 rows are loaded in one go from the server.
The file based data sources OpenCReports supports are:
Comma-separated values
a.k.a. CSV
eXtensible Markup Language
a.k.a. XML
JavaScript Object Notation
a.k.a. JSON
spreadsheet formats, like XLS, XLSX and ODS
The XML
and JSON
file types expect the data presented in a certain
structure syntax. The
semantics is application defined.
The expected format for these file types are described
below.
CSV ("Comma Separated Values") is a simple tabulated file format. Every line must have the same number of columns, the values are separated by commas. The first line in the file contains the column names.
Using only the CSV file, the data type cannot be determined. Because of this, every column is assumed to be a string, regardless if the values themselves are quoted or not in the file. Data conversion functions must be used, see for example Section 4.12.4, Section 4.10.16 and Section 4.10.10.
Using either report XML description or programming code, an optional set of type indicators may be added along with the CSV input file, so the explicit conversion functions may be omitted from expressions using the data.
A JSON file is expected in this format:
{ "columns": ["colname1", ... ], "coltypes": ["type", ... ], "rows": [ { "colname1": value1, ... }, ... ] }
The JSON file is expected to list the column
names in a string array called columns
.
The column types are optionally listed in the string
array called coltypes
. If they are
listed, the coltypes
array must have
the same number of strings as the columns
array. The type names are string
,
number
or datetime
.
If the column type array is missing, then all data
values are assumed to be strings and data conversion
functions must be used, see for example
Section 4.12.4,
Section 4.10.16 and
Section 4.10.10.
The data rows are listed in a JSON array
called rows
and column data values for
each row are in a JSON collection with
data names from the columns
and
data types from the coltypes
arrays.
The sections columns
,
coltypes
and
rows
may appear in any order.
When the coltypes
part is missing
from the JSON input file, then using either report XML
description or programming code, an optional set of type
indicators may be added along with the JSON input file,
so the explicit conversion functions may be omitted from
expressions using the data.
An XML file datasource is expected in this format:
<?xml version="1.0"?> <data> <rows> <row> <col>value</col> ... </row> ... </rows> <fields> <field>column1</field> ... </fields> <coltypes> <col>type1</col> ... </coltypes> </data>
The XML section names <data>
,
<rows>
and
<fields>
are the same as they were in RLIB for its XML data
source. The order of <rows>
and <fields>
is not important.
But the order of field names in <fields>
must match the column value order in each
<row>
.
The optional section <coltypes> is new in
OpenCReports. If it's present, then it must list
the data types in the same order as section
<fields>
.
The types may be string
,
number
or datetime
.
If this section is not present, all values are assumed
to be strings and data conversion functions must be
used, see for example
Section 4.12.4,
Section 4.10.16 and
Section 4.10.10.
When the coltypes
part is missing
from the XML input file, then using either report XML
description or programming code, an optional set of type
indicators may be added along with the XML input file,
so the explicit conversion functions may be omitted from
expressions using the data.
OpenCReports also supports various spreadsheet formats
as datasources. For that, it relies on the Python
pandas
module and the supporting
modules for the actual spreadsheet file format.
Such Python modules are xlrd
for
the older Microsoft XLS format,
pyopenxl
for the newer Microsoft
XLSX format, and odfpy
for LibreOffice
ODS format. Other modules may also be used to support
other spreadsheet file formats.
Applications may also have internal data that can be used as input for OpenCReports.
OpenCReports supports using two-dimensional C arrays as directly accessible application data. Such arrays must be declared as
char *array[ROWS][COLUMNS]
or converted to it if using OpenCReports from a different language. Each element is a pointer to a zero-terminated C string. The first row contains the names of columns.
Optionally, a set of type indicators may be supplied, similarly to the File based data sources.
OpenCReports allows application defined datasource drivers that may even override built-in datasource drivers.
An application defined data source may be any of the previously listed types: SQL, file or data based.