Complete Guide
Overview
Ajna Analytical Engine is a high-performance Python library for executing complex analytical queries across multiple databases. It is built on Polars for speed and offers comprehensive query capabilities.
Installation & Setup
Installation
Required Configuration
The engine requires a datasource configuration file (YAML) that defines your database connections and table schemas.
Database Support
Production-Ready Databases
- PostgreSQL - Full support with connection pooling
- MySQL - Full support with connection pooling
Enterprise Databases (via Connection API)
- AWS Athena - IAM roles, access keys, workgroups
- Google BigQuery - Service accounts, projects, datasets
- Snowflake - Username/password, key-pair authentication
- Azure Synapse - Managed identity, service principals
- Amazon Redshift - IAM authentication
- SQLite - File-based databases
Authentication Methods
Database Authentication
Cloud Authentication
Enterprise Connection Management
Query Types
1. Basic SELECT Queries
2. JOIN Operations
All join types are supported: inner, left, right, full, cross
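The join types differ mainly in which unmatched rows they keep. The sketch below illustrates that difference with Python's standard sqlite3 module rather than the engine's own API, which this guide does not show. The tables and rows are made up, and only inner, left, and cross joins appear because older SQLite builds lack right and full joins.

```python
import sqlite3

# Hypothetical sample data; this is plain SQLite, not the Ajna engine API.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben');
    INSERT INTO orders VALUES (1, 10.0), (1, 5.0);
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# inner: only customers that have at least one matching order
inner_rows = run(
    "SELECT c.name, o.amount FROM customers c "
    "INNER JOIN orders o ON o.customer_id = c.id ORDER BY o.amount"
)
# left: every customer; order columns are NULL (None) when nothing matches
left_rows = run(
    "SELECT c.name, o.amount FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id ORDER BY c.id, o.amount"
)
# cross: every pairing of a customer with an order, with no join condition
cross_rows = run("SELECT c.name, o.amount FROM customers c CROSS JOIN orders o")
```

Here the left join keeps 'Ben' with a None amount, while the inner join drops him entirely.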
3. Aggregations
Supported functions: sum, avg, count, min, max, count_distinct, stddev, variance, median, percentile, mode
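To make the meaning of these aggregates concrete, the following sketch computes most of them over a small made-up column using only Python's standard library. It does not use the engine's API, and whether the engine applies sample or population formulas for stddev and variance is an open assumption here.

```python
import statistics

# Hypothetical sample column; plain Python, not the engine's API.
amounts = [10, 20, 20, 30, 50]

summary = {
    "sum": sum(amounts),
    "avg": statistics.mean(amounts),
    "count": len(amounts),
    "min": min(amounts),
    "max": max(amounts),
    "count_distinct": len(set(amounts)),   # duplicates counted once
    "median": statistics.median(amounts),
    "mode": statistics.mode(amounts),      # most frequent value
    # Sample (n - 1) forms; the engine's choice of formula is not stated
    # in this guide, so treat these two as an assumption.
    "variance": statistics.variance(amounts),
    "stddev": statistics.stdev(amounts),
}
```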
4. Window Functions
Full window function support: row_number, rank, dense_rank, lag, lead, first_value, last_value, ntile
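The ranking functions are easy to confuse when values tie. The sketch below works through row_number, rank, dense_rank, and lag by hand in plain Python over made-up data that is already ordered; it shows the standard SQL semantics and is not the engine's implementation.

```python
# Hypothetical scores, already sorted as ORDER BY score DESC would sort them.
scores = [("ana", 90), ("bo", 85), ("cy", 85), ("di", 70)]

rows = []
rank = dense_rank = 0
prev_score = None
for row_number, (name, score) in enumerate(scores, start=1):
    if score != prev_score:
        rank = row_number   # rank skips ahead after ties, leaving gaps
        dense_rank += 1     # dense_rank grows by one per distinct value
    lag = prev_score        # lag(score, 1): previous row's value, None on the first row
    rows.append((name, row_number, rank, dense_rank, lag))
    prev_score = score
```

The two tied rows share rank 2, after which rank continues at 4 while dense_rank continues at 3.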
5. Subqueries
Supported types: scalar, in, not_in, exists, not_exists, derived_table, any, all
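As a rough illustration of three of these subquery types, the sketch below expresses scalar, in, and not_exists subqueries in SQL and runs them with Python's standard sqlite3 module. The tables are hypothetical, and the engine's own way of declaring subqueries is not shown in this guide.

```python
import sqlite3

# Hypothetical sample data; this is plain SQLite, not the Ajna engine API.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 10.0), (1, 30.0), (2, 5.0);
""")

# scalar: the subquery yields a single value used in a comparison
above_avg = conn.execute(
    "SELECT amount FROM orders WHERE amount > (SELECT AVG(amount) FROM orders)"
).fetchall()

# in: keep customers whose id appears in the subquery result
with_orders = conn.execute(
    "SELECT name FROM customers "
    "WHERE id IN (SELECT customer_id FROM orders) ORDER BY id"
).fetchall()

# not_exists: keep customers for which the correlated subquery finds no row
without_orders = conn.execute(
    "SELECT name FROM customers c WHERE NOT EXISTS "
    "(SELECT 1 FROM orders o WHERE o.customer_id = c.id)"
).fetchall()
```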
6. Case Expressions
7. String Functions
Supported: concat, substring, upper, lower, trim, length, replace, split
8. Mathematical Functions
Supported: abs, round, ceil, floor, power, sqrt, mod, sin, cos, tan, greatest, least
9. Date/Time Functions
Supported: extract, year, month, day, date_add, date_sub, datediff, now, current_date
10. NULL Handling Functions
Supported: isnull, coalesce, nullif, ifnull
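The behavior of these functions can be checked quickly against SQLite, which ships with Python. The sketch below shows the standard SQL semantics of coalesce, nullif, ifnull, and a NULL test; it assumes the engine follows the same semantics, which this guide does not state explicitly.

```python
import sqlite3

# Plain SQLite is used here to show standard NULL semantics, not the engine's API.
conn = sqlite3.connect(":memory:")
row = conn.execute(
    """
    SELECT
        COALESCE(NULL, NULL, 7),  -- first non-NULL argument
        NULLIF(3, 3),             -- NULL when both arguments are equal
        NULLIF(3, 4),             -- otherwise the first argument
        IFNULL(NULL, 'n/a'),      -- two-argument form of COALESCE
        NULL IS NULL              -- NULL test, 1 means true in SQLite
    """
).fetchone()
```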
11. Common Table Expressions (CTEs)
SQL Conversion
Convert existing SQL queries to the engine's native query format.
Configuration
Datasource Configuration
Engine Configuration (Optional)
Performance & Caching
Query Result Caching
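Caching keyed on a query signature can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the engine's implementation: the dictionary query shape, query_signature, cached_execute, and fake_execute are all hypothetical names, and the signature is assumed to be a hash of a canonical JSON form of the query.

```python
import hashlib
import json

def query_signature(query):
    """Hash a canonical JSON form so that key order does not change the signature."""
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

_cache = {}

def cached_execute(query, execute):
    """Return a cached result when the same signature has been seen before."""
    key = query_signature(query)
    if key not in _cache:
        _cache[key] = execute(query)
    return _cache[key]

calls = []

def fake_execute(query):
    """Stand-in for a real database call; records how often it runs."""
    calls.append(query)
    return [("row", len(calls))]

# Hypothetical query shape: same content, different key order.
q1 = {"sources": ["orders"], "select": ["id", "amount"], "limit": 10}
q2 = {"limit": 10, "select": ["id", "amount"], "sources": ["orders"]}
first = cached_execute(q1, fake_execute)
second = cached_execute(q2, fake_execute)
```

Both calls resolve to the same signature, so fake_execute runs only once. A real cache would also need an eviction or expiry policy, which is omitted here.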
Results are automatically cached based on the query signature.
Performance Monitoring
Error Handling
The engine provides detailed error information.
Best Practices
- Always provide datasource configuration - it’s mandatory
- Use query limits to prevent accidentally large result sets
- Enable caching for frequently executed queries
- Monitor performance with built-in metrics
- Use connection pooling for production databases
- Validate queries during development using
engine.validate_query() - Use environment variables for credentials in production
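For the last practice, one possible way to keep credentials out of configuration files is to read them from the environment at startup. The sketch below is a generic pattern, not part of the engine: load_db_credentials, the AJNA_DB prefix, and the variable names are hypothetical, and the default port assumes PostgreSQL.

```python
import os

def load_db_credentials(prefix="AJNA_DB"):
    """Read connection settings from environment variables (names are hypothetical)."""
    required = ["HOST", "USER", "PASSWORD"]
    missing = [
        f"{prefix}_{name}" for name in required
        if f"{prefix}_{name}" not in os.environ
    ]
    if missing:
        # Fail early with the variable names, never with secret values.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "host": os.environ[f"{prefix}_HOST"],
        # 5432 is the PostgreSQL default; adjust for other databases.
        "port": int(os.environ.get(f"{prefix}_PORT", "5432")),
        "user": os.environ[f"{prefix}_USER"],
        "password": os.environ[f"{prefix}_PASSWORD"],
    }
```

Failing fast on missing variables surfaces deployment mistakes at startup instead of at the first query.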
